This guide covers the most common issues encountered when running Flyte. Before diving in, collect the following diagnostic information:
# Get the pod status and events
kubectl describe pod <PodName> -n <namespace>
# Get pod logs
kubectl logs <PodName> -n <namespace>
<PodName> is the node execution string shown in the Flyte UI. <namespace> corresponds to the Flyte project-domain, e.g. flytesnacks-development.
The Flyte UI shows node execution IDs like ab5mg9lzgth62h82qprp-n0-0. This is also the pod name in Kubernetes.
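As a minimal sketch, the two commands above can be assembled from the values shown in the Flyte UI. The helper name `diagnostic_commands` is hypothetical, and the sketch assumes the namespace follows the `<project>-<domain>` pattern described above.

```python
from typing import List


def diagnostic_commands(pod_name: str, project: str, domain: str) -> List[List[str]]:
    """Build the kubectl commands for one node execution (hypothetical helper)."""
    # Assumption: the namespace is "<project>-<domain>", e.g. flytesnacks-development.
    namespace = f"{project}-{domain}"
    return [
        ["kubectl", "describe", "pod", pod_name, "-n", namespace],
        ["kubectl", "logs", pod_name, "-n", namespace],
    ]


if __name__ == "__main__":
    for cmd in diagnostic_commands("ab5mg9lzgth62h82qprp-n0-0", "flytesnacks", "development"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run` unchanged, which avoids shell quoting issues.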
Installation and sandbox issues
Cannot connect to the Docker daemon
Error:
Error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock.
Is the docker daemon running?
This occurs when a desktop runtime such as Docker Desktop or Rancher Desktop is used instead of the native Docker engine, because the socket lives at a different path.
Fix for Docker Desktop on macOS:
sudo ln -s ~/Library/Containers/com.docker.docker/Data/docker.raw.sock /var/run/docker.sock
Fix for Docker Desktop on Linux:
sudo ln -s ~/.docker/desktop/docker.sock /var/run/docker.sock
Fix for Rancher Desktop on Linux:
sudo ln -s ~/.rd/docker.sock /var/run/docker.sock
If you are using another container runtime, link its socket to /var/run/docker.sock.
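Before or after creating a symlink, it can help to confirm that /var/run/docker.sock actually resolves to a socket. The following is a minimal sketch using only the Python standard library; the helper name `is_unix_socket` is hypothetical and is not part of Flyte or Docker.

```python
import os
import stat


def is_unix_socket(path: str) -> bool:
    """Return True if `path` exists and resolves to a Unix domain socket."""
    try:
        # os.stat follows symlinks, so a link to the real socket also passes.
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file or dangling symlink.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    print(is_unix_socket("/var/run/docker.sock"))
```

A result of `False` after linking usually means the link target does not exist, which suggests the source path for your runtime is different.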
Insufficient CPU when starting sandbox
Error:
message: '0/1 nodes are available: 1 Insufficient cpu.
preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.'
This is common on macOS with Docker Desktop.
Fix: Open the Docker Desktop settings and increase resources to a minimum of 4 CPU cores and 3 GB RAM.
TLS certificate not trusted (x509 error)
Error:
authentication handshake failed: x509: "Kubernetes Ingress Controller Fake Certificate" certificate is not trusted
This occurs when TLS is not properly configured in a flyte-core deployment.
Fix: Enable TLS in your values.yaml:
ingress:
  host: example.com
  separateGrpcIngress: true
  separateGrpcIngressAnnotations:
    ingress.kubernetes.io/backend-protocol: "grpc"
  annotations:
    ingress.kubernetes.io/app-root: "/console"
    ingress.kubernetes.io/default-backend-redirect: "/console"
    kubernetes.io/ingress.class: haproxy
  tls:
    enabled: true
Also update your flytectl config to disable insecure mode:
admin:
  endpoint: dns:///example.com
  authType: Pkce
  insecure: false
  insecureSkipVerify: true  # skips TLS certificate verification, e.g. for a self-signed certificate
Wrong SSL version (OPENSSL_internal:WRONG_VERSION_NUMBER)
Error:
OPENSSL_internal:WRONG_VERSION_NUMBER
For flyte-binary: Verify that the endpoint name in your config.yaml matches the DNS names in the SSL certificate (whether self-signed or CA-issued).
For sandbox: Verify that the FLYTECTL_CONFIG environment variable points to the correct config file:
export FLYTECTL_CONFIG=~/.flyte/config-sandbox.yaml
Execution failures
OOMKilled — container terminated with exit code 137
Error:
terminated with exit code (137). Reason [OOMKilled]
The container exceeded its memory limit.
Fix 1: For Helm deployments, update the task resource defaults in your values.yaml:
inline:
  task_resources:
    defaults:
      cpu: 100m
      memory: 100Mi
      storage: 100Mi
    limits:
      memory: 1Gi
Fix 2: Override resource limits directly in your task code:
from flytekit import Resources, task

# Raise the memory limit for this task only.
@task(limits=Resources(mem="256Mi"))
def your_task() -> None:
    ...
Fix 3: For EKS deployments, adjust the limits in the inline section of eks-production.yaml. Use the most recent Helm charts.
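To compare a task-level override against the configured limit, both quantities first need to be converted to bytes. The sketch below is a simplified parser, not the Kubernetes implementation: it handles whole numbers with a subset of the binary and decimal suffixes, and the names `memory_bytes` and `fits_within` are hypothetical.

```python
import re

# Multipliers for a subset of the binary (Ki, Mi, ...) and decimal (k, M, ...)
# suffixes that Kubernetes accepts for memory quantities.
_UNITS = {
    "": 1,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def memory_bytes(quantity: str) -> int:
    """Convert a quantity such as "256Mi" or "1Gi" to bytes (integers only)."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", quantity.strip())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def fits_within(requested: str, limit: str) -> bool:
    """Return True if the requested memory does not exceed the limit."""
    return memory_bytes(requested) <= memory_bytes(limit)
```

For example, `fits_within("256Mi", "1Gi")` holds for the values shown above, while a 2Gi override would not fit under a 1Gi limit.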
Error: Kubernetes cannot pull the task container image.
Fix 1: If your environment uses a network proxy, pass the proxy configuration when starting the sandbox:
flytectl demo start --env HTTP_PROXY=<your-proxy-IP>
Fix 2: Never use latest as an image tag. Kubernetes defaults the image pull policy to Always for the latest tag, forcing a pull on every pod start. Use a specific version tag:
@task(container_image="my-registry.example.com/my-image:v1.2.3")
def my_task() -> None:
    ...
Fix 3 : If the registry requires authentication, create a Kubernetes image pull secret and configure it in your pod template.
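The advice to avoid the latest tag can be checked mechanically before registering a task. This is a rough sketch, assuming plain `name[:tag]` or `name@digest` image references; the helper name `is_pinned` is hypothetical.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference uses a digest or a tag other than "latest"."""
    if "@" in image:
        # A digest reference (name@sha256:...) always identifies one exact image.
        return True
    # Only inspect the last path component, so a registry port such as
    # localhost:5000/my-image is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    if ":" not in last_component:
        # No tag at all: the reference resolves to "latest".
        return False
    tag = last_component.split(":", 1)[1]
    return tag != "latest"
```

A check like this could run in CI against every `container_image` value, so that an untagged or latest image is caught before it reaches the cluster.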
ModuleNotFoundError in container tasks
Error:
ModuleNotFoundError: No module named 'mymodule'
Cause: The Python module is not on the container's path.
Fix: If using a custom Docker image, ensure:
Your Dockerfile is at the same level as the flyte directory.
An empty __init__.py exists in your project folder.
Expected directory layout:
myflyteapp/
├── Dockerfile
├── docker_build_and_tag.sh
└── flyte/
    ├── __init__.py
    └── workflows/
        ├── __init__.py
        └── example.py
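A missing `__init__.py` is easy to overlook in nested folders. The following sketch walks a project directory and reports every folder that contains Python files but no `__init__.py`; the function name `dirs_missing_init` is hypothetical.

```python
import os
from typing import List


def dirs_missing_init(root: str) -> List[str]:
    """List directories under `root` that hold .py files but no __init__.py."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        has_python = any(name.endswith(".py") for name in filenames)
        if has_python and "__init__.py" not in filenames:
            # Report paths relative to the root for readability.
            missing.append(os.path.relpath(dirpath, root))
    return sorted(missing)
```

Running `dirs_missing_init("flyte")` from the myflyteapp directory should return an empty list when the layout matches the one above.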
Spark task error: JavaPackage is not callable
Error:
FlyteScopedUserException: 'JavaPackage' object is not callable
Cause: The spark plugin is not enabled in the FlytePropeller configuration.
Fix: Add spark to the enabled-plugins list in your config YAML:
tasks:
  task-plugins:
    enabled-plugins:
      - container
      - sidecar
      - K8S-ARRAY
      - spark
    default-for-task-types:
      - container: container
      - container_array: K8S-ARRAY
Dynamic workflow: failed + succeeded + running inconsistent state
Error: An execution appears stuck or reports an inconsistent failed + succeeded + running state.
Cause: A malformed dynamic workflow was processed by FlytePropeller. This was a known bug fixed in v1.16.4.
Fix: Upgrade to Flyte v1.16.4 or later. If you cannot upgrade immediately, use RecoverExecution to resume from the last known good state:
grpcurl -plaintext \
  -d '{"id": {"project": "flytesnacks", "domain": "development", "name": "<execution-id>"}}' \
  localhost:81 flyteidl.service.AdminService/RecoverExecution
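When the recovery call has to be scripted for several executions, hand-written JSON inside shell quotes is easy to get wrong. The sketch below builds the same grpcurl invocation as an argument list; the function name `recover_execution_command` is hypothetical, and the default address simply mirrors the localhost:81 used above.

```python
import json
from typing import List


def recover_execution_command(project: str, domain: str, name: str,
                              address: str = "localhost:81") -> List[str]:
    """Build the grpcurl argument list for AdminService/RecoverExecution."""
    # json.dumps handles quoting, so the payload stays valid JSON
    # even if a value contains special characters.
    payload = json.dumps({"id": {"project": project, "domain": domain, "name": name}})
    return [
        "grpcurl", "-plaintext",
        "-d", payload,
        address,
        "flyteidl.service.AdminService/RecoverExecution",
    ]
```

The result can be passed to `subprocess.run(cmd, check=True)`; because no shell is involved, the payload needs no extra escaping.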
Storage and data issues
AccessDenied when writing to S3 (EKS deployment)
Error:
An error occurred (AccessDenied) when calling the PutObject operation
Cause: The Kubernetes service account Flyte uses does not have the correct IAM role annotation for IRSA (IAM Roles for Service Accounts).
Fix 1: Verify the service account annotation:
kubectl describe sa <my-flyte-sa> -n <flyte-namespace>
The output should include:
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/flyte-system-role
Fix 2: If the annotation is missing, add it manually:
kubectl annotate serviceaccount -n <flyte-namespace> <my-flyte-sa> \
  eks.amazonaws.com/role-arn=arn:aws:iam::<account-id>:role/<flyte-iam-role>
Refer to the community-maintained Flyte the Hard Way guide for full EKS IAM configuration.
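The annotation check can also be automated against the JSON form of the service account, as printed by `kubectl get sa <my-flyte-sa> -n <flyte-namespace> -o json`. This is a minimal sketch: the helper name `irsa_role_arn` is hypothetical, and the pattern assumes the standard `aws` partition with a 12-digit account ID.

```python
import re
from typing import Optional

ROLE_ANNOTATION = "eks.amazonaws.com/role-arn"
# Assumption: the standard "aws" partition and a 12-digit account ID.
_ROLE_ARN = re.compile(r"arn:aws:iam::\d{12}:role/.+")


def irsa_role_arn(service_account: dict) -> Optional[str]:
    """Return the IRSA role ARN from a parsed ServiceAccount object, or None.

    None means the annotation is missing or does not look like a role ARN.
    """
    annotations = service_account.get("metadata", {}).get("annotations") or {}
    arn = annotations.get(ROLE_ANNOTATION)
    if arn is None or _ROLE_ARN.fullmatch(arn) is None:
        return None
    return arn
```

A `None` result points to Fix 2 above. A valid ARN with a continuing AccessDenied error suggests the problem lies in the IAM role's policy or trust relationship instead.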
Cannot access Minio in local sandbox
When running the local sandbox, Minio is available at the endpoint and with the credentials shown below. For debugging, set these environment variables when running tasks locally:
export FLYTE_AWS_ENDPOINT="http://localhost:30002"
export FLYTE_AWS_ACCESS_KEY_ID="minio"
export FLYTE_AWS_SECRET_ACCESS_KEY="miniostorage"
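If tasks still cannot reach Minio, a quick first step is to confirm that the endpoint accepts TCP connections at all. The sketch below uses only the standard library and checks connectivity, not credentials; the function name `endpoint_reachable` is hypothetical.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False


if __name__ == "__main__":
    print(endpoint_reachable("http://localhost:30002"))
```

A `False` result for http://localhost:30002 suggests the sandbox is not running or the port is not exposed on the host.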
Authentication issues
Unauthenticated errors in flytectl
Error: rpc error: code = Unauthenticated
Fix 1: Re-authenticate:
flytectl config init --host flyte.example.com
Fix 2: Verify your config file has the correct auth settings:
admin:
  endpoint: dns:///flyte.example.com
  authType: Pkce  # or ClientSecret for service accounts
  insecure: false
Fix 3: For development/sandbox, you can disable auth entirely:
admin:
  endpoint: dns:///localhost:30080
  insecure: true
Auth config not found after demo start
After running flytectl demo start, the sandbox config is written to ~/.flyte/config-sandbox.yaml. Export it:
export FLYTECTL_CONFIG=~/.flyte/config-sandbox.yaml
Add this to your shell profile to persist it across sessions.
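To confirm that the variable is set and points to a real file, a small check like the following can help. It is a sketch only: the function name `flytectl_config_path` is hypothetical and it does not validate the contents of the file.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def flytectl_config_path(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the file FLYTECTL_CONFIG points to, or None if unset or missing."""
    value = env.get("FLYTECTL_CONFIG")
    if not value:
        return None
    # The shell normally expands "~" before exporting; expand it here as well
    # in case the variable was set with a quoted value.
    path = Path(value).expanduser()
    return path if path.is_file() else None


if __name__ == "__main__":
    print(flytectl_config_path())
```

A `None` result in a new terminal usually means the export line has not yet been added to the shell profile.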
Getting more help
GitHub Issues: Open a bug report or feature request.
Slack Community: Get real-time help in the #ask-the-community channel.
GitHub Discussions: Ask questions or share ideas with the community.
Documentation: Browse the official Flyte documentation.