Trouble Shooting

[Kubernetes] dial tcp 127.0.0.1:10248: connect: connection refused.

happyso 2022. 7. 24. 22:56

[문제]

kubeadm init 실행시 다음과 같은 오류가 발생했다

 

[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.

Unfortunately, an error has occurred:
timed out waiting for the condition

This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

 

 그래서 systemctl status kubelet.service 로 상태를 확인해보니 fail인 상태였다.

 

kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             \u2514\u250010-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Sun 2022-07-24 12:14:57 UTC; 4s ago
       Docs: https://kubernetes.io/docs/home/
    Process: 46901 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, s>
   Main PID: 46901 (code=exited, status=1/FAILURE)
        CPU: 61ms

 

[해결 과정]

cgroup의 문제일 거라 생각하여 container runtime과 kubeadm의 cgroup을 아무리 맞춰줘도 같은 오류가 발생했다.......

결론은 swap off 를 해주니 해결되었다...! (허무)

https://github.com/kubernetes/kubernetes/issues/54542

 

kubelet.service fail to start up · Issue #54542 · kubernetes/kubernetes

Is this a BUG REPORT or FEATURE REQUEST?: /kind bug What happened: kubelet.service fail to start up What you expected to happen: kubelet.service start up success How to reproduce it (as minimally a...

github.com

 

명싱하자 SWAP OFF !!!!!

쿠버네티스를 실행하기 위해서는 OS의 SWAP을 꺼줘야한다.

sudo swapoff -a
vi /etc/fstab  # SWAP이 정의된 줄을 '#'으로 주석처리해준다.