Same node pod lookup fails

11/23/2017

I have started noticing that in my 5-node Kubernetes 1.7.10 cluster, pods scheduled on the same node can't communicate with each other. nslookup of any service fails from any pod on that node, but works fine when run from a pod on a different node. I'm not sure if this is a kube-dns or flannel issue; any pointers on how to debug this?

To work around this, I have to reschedule these pods onto a brand-new node. I have tried restarting flannel, but it didn't help. The next time this happens, I will restart the kube-dns pods.
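For reference, a minimal sketch of how I plan to restart them, assuming the kube-dns pods carry the standard `k8s-app=kube-dns` label in the `kube-system` namespace (the default for 1.7-era clusters):

```shell
# List the kube-dns pods (label is an assumption based on the default manifests)
kubectl get pods -n kube-system -l k8s-app=kube-dns

# Delete them; the owning Deployment/ReplicaSet recreates them immediately
kubectl delete pods -n kube-system -l k8s-app=kube-dns
```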

-- example

To give an example, we had Kafka and ZooKeeper (running fine, and used by another Kafka pod on a different node) scheduled on the same node, and Kafka was not able to find ZooKeeper. nslookup failed from that Kafka pod but worked fine from any other pod. This is not a Kafka problem, as we have the same issue on other nodes between different pods. How do I check the kube-dns entries? The kube-dns logs seem to show everything set up just fine, with no errors.

```
[2017-11-22 12:00:56,194] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server 'zookeeper:2181' with timeout of 6000 ms
    at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1233)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
    at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:106)
    at kafka.utils.ZkUtils$.apply(ZkUtils.scala:88)
    at kafka.server.KafkaServer.initZk(KafkaServer.scala:329)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:187)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:39)
    at kafka.Kafka$.main(Kafka.scala:67)
    at kafka.Kafka.main(Kafka.scala)
[2017-11-22 12:00:56,208] INFO shutting down (kafka.server.KafkaServer)
```
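To narrow down whether it's DNS resolution or the pod network itself, here is a debugging sketch I could run next time; `<affected-node>` is a placeholder for the bad node, `zookeeper` is the service that failed for us, and the `k8s-app=kube-dns` label and `kubedns` container name assume the default kube-dns manifests:

```shell
# Start a throwaway pod pinned to the affected node (nodeName bypasses the scheduler)
kubectl run dns-debug --restart=Never --image=busybox:1.28 \
  --overrides='{"spec":{"nodeName":"<affected-node>"}}' \
  -- sleep 3600

# Try DNS resolution from inside it: the failing service, then a known-good name
kubectl exec dns-debug -- nslookup zookeeper
kubectl exec dns-debug -- nslookup kubernetes.default

# Check which resolver the pod was given (should point at the kube-dns service IP)
kubectl exec dns-debug -- cat /etc/resolv.conf

# Check kube-dns and flannel pod health and placement
kubectl get pods -n kube-system -o wide | grep -E 'kube-dns|flannel'
kubectl logs -n kube-system -l k8s-app=kube-dns -c kubedns --tail=50
```

If `nslookup` against the kube-dns service IP fails only from this node, that points at the overlay network (flannel) rather than kube-dns itself; if `/etc/resolv.conf` is wrong, it points at kubelet configuration.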

-- padlar
dns
flannel
kubernetes

0 Answers