Network Reliability Engineering Community

Changes in syringe.yml for Antidote on Local Machine

After building an image using docker build and pushing it to Docker Hub, how do we declare this in the syringe.yml file so that the changes are reflected when building the lesson locally on our machines? Currently, the "image" field under utilities and devices in syringe.yml is set to "antidotelabs/salt".

Welcome!

The value for the image field is passed directly to Kubernetes (and, as a result, Docker), so this can be any value you want. Our policy for the NRE Labs curriculum is that the source for your images is included in the pull request, so that we can build the images within the antidotelabs organization, but this is a policy for our production site's curriculum. There's no filtering within Antidote itself, so during development you should build your images, push them to your own username on Docker Hub, and simply refer to them in your syringe.yaml files (e.g. image: ashwini/salt).
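For illustration, the change in your lesson's syringe file might look something like this (the surrounding field layout is a rough sketch; consult an existing lesson in the curriculum repo for the exact schema):

utilities:
  - name: salt1
    image: ashwini/salt   # your own Docker Hub username/repo instead of antidotelabs/salt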

If you open a PR to https://github.com/nre-learning/nrelabs-curriculum, then we’ll make sure we have everything we need to build the image on our side. But until then, it’s all yours.

Matt,

I tried the approach you mentioned above by uploading the image to my Docker Hub account and changing the syringe.yaml file (i.e. image: shwetak02/salt1), then reloaded selfmedicate.sh. After reloading the script, when I refresh the GUI at http://antidote-local:30001/labs/?lessonId=30&lessonStage=4, it gives a timeout error (screenshot attached).

[Screenshot: lesson timeout error]

Do I need to start the selfmedicate.sh script again, rather than just reloading?

To answer your last question: no, you should never need to use start more than once. If all you're trying to do is load a revised version of the lesson (including the changes you made), reload is sufficient.
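In other words (assuming the subcommands behave as described above; exact invocation may vary with your version of the script):

./selfmedicate.sh start    # first run only: brings the local environment up
./selfmedicate.sh reload   # every time you change lesson content or images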

Regarding this error, it's possible there was some kind of issue pulling the new image. Take a look at https://antidoteproject.readthedocs.io/en/latest/building/local.html#lesson-times-out-while-loading - there are a few troubleshooting tips there. To figure out whether there are issues getting the image itself, the kubectl describe command will help. Try something like this (see the example session after these steps):

kubectl get ns will give you the list of active namespaces. If you have a lesson running, you should see one that looks like 30-abcdefghijkl. Copy this to your clipboard.

Then, you can run kubectl get pods -n=<namespace on your clipboard> to see the pods running for that lesson. These should all say Running or Completed.

For any that don't, run kubectl describe pod <pod name> -n=<namespace on your clipboard> and paste the result(s) here.
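Putting those steps together, a session might look like this (namespace and pod names are illustrative):

$ kubectl get ns
NAME                 STATUS   AGE
30-abcdefghijkl-ns   Active   5m
default              Active   21d

$ kubectl get pods -n=30-abcdefghijkl-ns
NAME    READY   STATUS             RESTARTS   AGE
salt1   0/1     ImagePullBackOff   0          5m

$ kubectl describe pod salt1 -n=30-abcdefghijkl-ns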

If that all checks out, the only other step might be to send the Syringe logs. Let's start with all of that and go from there.


Thanks a lot, Matt. We will try this and let you know.

Hi Matt,

I am still facing the timeout error. I followed your suggestions above. I believe it's because my Docker Hub repository is private. Do you suggest I create a public repo and retry? I did run export DOCKER_ID_USER="ashwinir" and docker login. Please also see the output below:

ashwini-mbp:antidote-selfmedicate ashwini$ kubectl get ns
NAME                     STATUS   AGE
30-d3uwcnplgcasnzi3-ns   Active   8m24s
default                  Active   21d
kube-public              Active   21d
kube-system              Active   21d

ashwini-mbp:antidote-selfmedicate ashwini$ kubectl get pods -n=30-d3uwcnplgcasnzi3-ns
NAME    READY   STATUS             RESTARTS   AGE
salt1   0/1     ImagePullBackOff   0          10m
vqfx1   0/1     ImagePullBackOff   0          10m

ashwini-mbp:antidote-selfmedicate ashwini$ kubectl describe pod -n=30-d3uwcnplgcasnzi3-ns
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  11m                  default-scheduler  Successfully assigned 30-d3uwcnplgcasnzi3-ns/vqfx1 to minikube
  Normal   Pulling    11m                  kubelet, minikube  pulling image "bash"
  Normal   Pulled     11m                  kubelet, minikube  Successfully pulled image "bash"
  Normal   Created    11m                  kubelet, minikube  Created container
  Normal   Started    11m                  kubelet, minikube  Started container
  Normal   Pulling    9m34s (x4 over 11m)  kubelet, minikube  pulling image "ashwinir/saltljc"
  Warning  Failed     9m34s (x4 over 11m)  kubelet, minikube  Failed to pull image "ashwinir/saltljc": rpc error: code = Unknown desc = Error response from daemon: pull access denied for ashwinir/saltljc, repository does not exist or may require 'docker login'
  Warning  Failed     9m34s (x4 over 11m)  kubelet, minikube  Error: ErrImagePull
  Warning  Failed     9m22s (x5 over 10m)  kubelet, minikube  Error: ImagePullBackOff
  Normal   BackOff    55s (x43 over 10m)   kubelet, minikube  Back-off pulling image "ashwinir/saltljc"

I would recommend making it public, or following the Kubernetes documentation for pulling images from private registries. Regular docker commands may not be sufficient.

Please see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
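In short, that page has you create a registry-credential secret and reference it from the pod spec. A rough sketch of the mechanism (the secret name regcred is arbitrary, and in Antidote's case the platform would have to attach it to the lesson pods for you, so treat this as background rather than a drop-in fix):

kubectl create secret docker-registry regcred \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=<your-username> \
  --docker-password=<your-password> \
  --docker-email=<your-email>

A pod that should pull from the private repo then references the secret like so:

apiVersion: v1
kind: Pod
metadata:
  name: salt1
spec:
  containers:
  - name: salt1
    image: ashwinir/saltljc
  imagePullSecrets:
  - name: regcred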

Thanks for your help, Matt. Stage 4 is now up and running on my local instance. I followed the troubleshooting steps and it worked.

Hi Matt,

Thanks for the link. I changed my repo on Docker Hub to public, but was getting the Syringe container CrashLoopBackOff error. I followed the troubleshooting you mentioned in that discussion, but I am getting the following error when going from stage 1 to stage 2 in the SaltStack lesson.

[Screenshot: error when advancing from stage 1 to stage 2]

ashwini-mbp:antidote-selfmedicate ashwini$ kubectl get pods
NAME                                        READY   STATUS    RESTARTS   AGE
antidote-web-57f98b78d4-cxwfm               2/2     Running   0          3h3m
nginx-ingress-controller-6f575d4f84-vlxxd   1/1     Running   0          3h3m
syringe-65ddb769c4-r2tqh                    1/1     Running   0          3h3m

ashwini-mbp:antidote-selfmedicate ashwini$ kubectl get ns
NAME                     STATUS   AGE
30-gvg6nogietpnv7ck-ns   Active   24m
default                  Active   3h5m
kube-public              Active   3h5m
kube-system              Active   3h5m

ashwini-mbp:antidote-selfmedicate ashwini$ kubectl get pods -n=30-gvg6nogietpnv7ck-ns
NAME                 READY   STATUS    RESTARTS   AGE
config-vqfx1-2fxlh   0/1     Error     0          18m
config-vqfx1-68m8b   0/1     Error     0          18m
config-vqfx1-7jr4j   0/1     Error     0          8m23s
config-vqfx1-7sjdl   0/1     Error     0          18m
config-vqfx1-7xsnh   0/1     Error     0          4m23s
config-vqfx1-bb7mh   0/1     Error     0          16m
config-vqfx1-cbqlc   0/1     Error     0          7m3s
config-vqfx1-h5scw   0/1     Error     0          10m
config-vqfx1-hjwjn   0/1     Error     0          9m43s
config-vqfx1-kfmp6   0/1     Error     0          10m
config-vqfx1-l4km8   0/1     Error     0          19m
config-vqfx1-xtwtn   0/1     Error     0          9m3s
salt1                1/1     Running   0          19m
vqfx1                1/1     Running   0          19m

You're seeing so many config pods because they're all failing and being retried. Taking the logs of one of them should tell you the problem - more than likely there was an issue configuring the device.
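For example, using one of the pod names from your output above:

kubectl logs config-vqfx1-2fxlh -n=30-gvg6nogietpnv7ck-ns

Whatever traceback or connection error shows up there should point at what failed while configuring vqfx1.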