Notes
Batch does not manage the cluster itself; it only manages nodes (automatic scale-out/in) and runs jobs. On EKS, Batch manages its own resources independently (it does not affect other pods, nodes, or ASGs). The best practice is to create a dedicated namespace.
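A minimal sketch of creating that dedicated namespace; the name my-aws-batch-namespace matches the one used later in these notes:

```shell
# Create a dedicated namespace so Batch-managed pods stay isolated
# from the rest of the cluster's workloads
kubectl create namespace my-aws-batch-namespace
```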
A compute environment is created in Batch and associated with the EKS cluster. The compute environment is decoupled from EKS: users effectively submit jobs to an abstract compute environment, and Batch takes care of translating jobs into pods.
A job is the smallest unit in Batch; a Batch job on EKS maps to a pod. When a job is submitted, the eksProperties in the job definition carry the parameters the job needs to run on EKS.
The podProperties of a running job have podName and nodeName parameters set for the current job attempt:
aws batch describe-jobs --jobs 2d044787-c663-4ce6-a6fe-f2baf7e51b04
When a job is submitted to EKS, Batch converts it into a pod spec, using labels and taints to ensure the job runs on Batch-managed nodes. The pod spec for a job on EKS has the following defaults:
hostNetwork = true
dnsPolicy = ClusterFirstWithHostNet
Use CloudWatch Logs to monitor Batch jobs running on EKS: https://docs.amazonaws.cn/en_us/batch/latest/userguide/batch-eks-cloudwatch-logs.html
Labels on the pod identify the Batch job ID and the compute environment UUID. Job information is made available to the job runtime by injecting environment variables into the pod:
kubectl describe pod aws-batch.14638eb9-d218-372d-ba5c-1c9ab9c7f2a1 -n my-aws-batch-namespace
Running GPU-based workloads on EKS: https://docs.amazonaws.cn/en_us/batch/latest/userguide/run-eks-gpu-workload.html
The memory and CPU reservation logic on EKS differs from GKE, especially for memory; Batch jobs may be affected by these reserved resources.
Set up the tools (awscli, kubectl), configure permissions (access to EKS), and create the cluster.
Note: Batch only supports EKS clusters with public endpoint access.
The resources created in EKS include:
a dedicated namespace
a ClusterRoleBinding, so Batch can monitor nodes and pods
a Role, created in the dedicated namespace and bound to the user aws-batch
an iamidentitymapping mapped to AWSServiceRoleForBatch (there is an unresolved bug here: the role's path must be removed, so the ARN cannot be copied as-is)
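A sketch of that identity mapping step, assuming eksctl is used and the cluster is named worklearn. The service-linked role's real ARN contains the path /aws-service-role/batch.amazonaws.com/, which the aws-auth ConfigMap does not accept, so the path segment is stripped:

```shell
# Real ARN: arn:aws-cn:iam::xxxxxx:role/aws-service-role/batch.amazonaws.com/AWSServiceRoleForBatch
# aws-auth does not support role paths, so map the path-free form instead:
eksctl create iamidentitymapping \
    --cluster worklearn \
    --arn "arn:aws-cn:iam::xxxxxx:role/AWSServiceRoleForBatch" \
    --username aws-batch
```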
Notes:
The BEST_FIT_PROGRESSIVE and SPOT_CAPACITY_OPTIMIZED allocation strategies are only supported from awscli 2.8.6 onward. Creating an EKS compute environment from the CLI produced the error below; a template generated with --generate-cli-skeleton indeed had no eksConfiguration field. Check the awscli version and upgrade to 2.8.6 or later.
Parameter validation failed:
Unknown parameter in input: "eksConfiguration", must be one of: computeEnvironmentName, type, state, unmanagedvCpus, computeResources, serviceRole, tags
Running the command again succeeds; it also creates the ASG and fills the network configuration into the corresponding launch template.
You can set ec2Configuration.imageType in --compute-resources to select GPU instance types:
The image type to match with the instance type to select an AMI. The supported values are different for ECS and EKS resources.
ECS: ECS_AL2, ECS_AL2_NVIDIA, ECS_AL1, ECS
EKS: EKS, EKS_AL2, EKS_AL2_NVIDIA (e.g. P4 and G4)
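A hedged sketch of what batch-eks-compute-environment.json might contain; the cluster ARN, subnet, security group, instance profile, and instance types below are placeholders, and EKS_AL2_NVIDIA is shown only to illustrate the GPU image type:

```shell
# Write an illustrative compute-environment input file
# (all resource identifiers are placeholders)
cat > batch-eks-compute-environment.json <<'EOF'
{
    "computeEnvironmentName": "My-eks-CE1",
    "type": "MANAGED",
    "state": "ENABLED",
    "eksConfiguration": {
        "eksClusterArn": "arn:aws-cn:eks:cn-north-1:xxxxxxxx:cluster/worklearn",
        "kubernetesNamespace": "my-aws-batch-namespace"
    },
    "computeResources": {
        "type": "EC2",
        "allocationStrategy": "BEST_FIT_PROGRESSIVE",
        "minvCpus": 0,
        "maxvCpus": 16,
        "instanceTypes": ["m5"],
        "ec2Configuration": [{"imageType": "EKS_AL2_NVIDIA"}],
        "subnets": ["subnet-xxxxxxxx"],
        "securityGroupIds": ["sg-xxxxxxxx"],
        "instanceRole": "arn:aws-cn:iam::xxxxxx:instance-profile/myEKSNodeRole"
    }
}
EOF
```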
aws batch create-compute-environment --cli-input-json file://./batch-eks-compute-environment.json
{
    "computeEnvironmentName": "My-eks-CE1",
    "computeEnvironmentArn": "arn:aws-cn:batch:cn-north-1:xxxxxxxx:compute-environment/My-eks-CE1"
}
aws batch describe-compute-environments
Create the job queue:
aws batch create-job-queue --cli-input-json file://./batch-eks-job-queue.json
{
    "jobQueueName": "My-eks-JQ1",
    "jobQueueArn": "arn:aws-cn:batch:cn-north-1:xxxxxx:job-queue/My-eks-JQ1"
}
Create the job definition, which is similar to an ECS job definition; its eksProperties configures pod parameters and can override the pod's command and other settings:
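A hedged sketch of what batch-eks-job-definition.json might look like for the sleep job used below; the image, command, and resource values mirror the pod dump at the end of these notes, but the exact file contents are an assumption:

```shell
# Write an illustrative EKS job-definition input file;
# eksProperties.podProperties describes the pod Batch will create
cat > batch-eks-job-definition.json <<'EOF'
{
    "jobDefinitionName": "MyJobOnEks_Sleep",
    "type": "container",
    "eksProperties": {
        "podProperties": {
            "hostNetwork": true,
            "containers": [
                {
                    "image": "public.ecr.aws/amazonlinux/amazonlinux:2",
                    "command": ["sleep", "60"],
                    "resources": {
                        "limits": {"cpu": "1", "memory": "1024Mi"}
                    }
                }
            ]
        }
    }
}
EOF
```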
aws batch register-job-definition --cli-input-json file://./batch-eks-job-definition.json
{
    "jobDefinitionName": "MyJobOnEks_Sleep",
    "jobDefinitionArn": "arn:aws-cn:batch:cn-north-1:xxxxxxx:job-definition/MyJobOnEks_Sleep:2",
    "revision": 2
}
Create a simple job and submit it to the job queue. Job scheduling can be controlled in the following ways:
Priority: set a priority on the job queue, and a scheduling priority on the job.
Scheduling policy: if no scheduling policy is specified when the job queue is created, the scheduler defaults to first-in, first-out (FIFO).
Fair-share scheduling: tag jobs with a share identifier; the scheduler picks jobs from the share with the lowest usage.
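As a sketch of the fair-share option (the policy name, share identifier, and weight here are hypothetical, and the shorthand syntax is an assumption), a scheduling policy can be created and then referenced when submitting jobs:

```shell
# Create a fair-share scheduling policy (attach it to the queue at creation time)
aws batch create-scheduling-policy \
    --name my-fairshare-policy \
    --fairshare-policy 'shareDecaySeconds=3600,shareDistribution=[{shareIdentifier=teamA,weightFactor=1}]'

# Submit a job into a share, optionally overriding its scheduling priority
aws batch submit-job \
    --job-queue My-eks-JQ1 \
    --job-definition MyJobOnEks_Sleep \
    --job-name My-eks-Job2 \
    --share-identifier teamA \
    --scheduling-priority-override 10
```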
aws batch submit-job --job-queue My-eks-JQ1 \
> --job-definition MyJobOnEks_Sleep --job-name My-eks-Job1
{
    "jobArn": "arn:aws-cn:batch:cn-north-1:xxxxxxxxxxxx:job/fe10768a-a3b5-4596-93f1-b48083332e73",
    "jobName": "My-eks-Job1",
    "jobId": "fe10768a-a3b5-4596-93f1-b48083332e73"
}
aws batch describe-jobs --jobs fe10768a-a3b5-4596-93f1-b48083332e73
The submitted job's JSON can be viewed in the console.

After this, a new m5.large instance launches from the EKS-optimized AMI, with the following user data added:
#!/bin/bash
set -ex
if [ -f /etc/aws-batch/batch.config ]; then
    while read line; do
        [ $(expr "$line" : "^[A-Za-z_][0-9A-Za-z_]*=.*") -gt 0 ] && eval export $line
    done < /etc/aws-batch/batch.config
fi
[ -z "$AWS_BATCH_KUBELET_EXTRA_ARGS" ] && AWS_BATCH_KUBELET_EXTRA_ARGS=""
/etc/eks/bootstrap.sh worklearn \
    --kubelet-extra-args ' '"$AWS_BATCH_KUBELET_EXTRA_ARGS"' ... '
The node failed to join the cluster, with the following errors:
kubelet_node_status.go:70] "Attempting to register node" node="ip-192-168-30-56.cn-north-1.compute.internal"
kubelet.go:2469] "Error getting node" err="node \"ip-192-168-30-56.cn-north-1.compute.internal\" not found"
kubelet_node_status.go:92] "Unable to register node with API server" err="Unauthorized" node="ip-192-168-30-56.cn-north-1.compute.internal"
kubelet.go:2469] "Error getting node" err="node \"ip-192-168-30-56.cn-north-1.compute.internal\" not found"
csi_plugin.go:1063] Failed to contact API server when waiting for CSINode publishing: Unauthorized
It turned out the node role had not been added to the cluster's aws-auth ConfigMap. Adding the following entry fixed it:
- groups:
    - system:bootstrappers
    - system:nodes
  rolearn: arn:aws-cn:iam::xxxxxx:role/myEKSNodeRole
To summarize the Batch node launch logic: Batch first confirms via a dry run that instances can launch in the configured subnets, then raises the desired count on the ASG to bring the nodes up. The following expected error shows up in CloudTrail:
An error occurred (InvalidParameter) when calling the RunInstances operation: Security group sg-0b1e6f21a1a04d078 and subnet subnet-027025e9d9760acdd belong to different networks.
After the fix, the compute environment came up and the job executed successfully.
The node configuration is shown below; the node repels other pods via taints:
apiVersion: v1
kind: Node
metadata:
  annotations:
    alpha.kubernetes.io/provided-node-ip: 192.168.15.116
    csi.volume.kubernetes.io/nodeid: '{"efs.csi.aws.com":"i-xxxxxxxx"}'
    node.alpha.kubernetes.io/ttl: "0"
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  labels:
    batch.amazonaws.com/compute-environment-revision: "4"
    batch.amazonaws.com/compute-environment-uuid: 6c63cab8-8b00-3021-bb3d-fb990cef9c60
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: m5.xlarge
    beta.kubernetes.io/os: linux
    failure-domain.beta.kubernetes.io/region: cn-north-1
    failure-domain.beta.kubernetes.io/zone: cn-north-1a
    k8s.io/cloud-provider-aws: f48c3b996b9bce33df562d04d847dfaf
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: ip-192-168-15-116.cn-north-1.compute.internal
    kubernetes.io/os: linux
    node.kubernetes.io/instance-type: m5.xlarge
    topology.kubernetes.io/region: cn-north-1
    topology.kubernetes.io/zone: cn-north-1a
  name: ip-192-168-15-116.cn-north-1.compute.internal
  resourceVersion: "34308242"
  uid: ffc8beb4-f326-4135-a471-e0b1d9511012
spec:
  providerID: aws:///cn-north-1a/i-xxxxxxx
  taints:
  - effect: NoSchedule
    key: batch.amazonaws.com/batch-node
  - effect: NoExecute
    key: batch.amazonaws.com/batch-node
The pod configuration is shown below. Labels identify the compute environment and job ID; the job pod carries taint tolerations and Batch environment variables. The default network mode is hostNetwork=true with dnsPolicy=ClusterFirstWithHostNet.
Once the job finishes, the pod is deleted immediately. You can configure the CloudWatch agent to collect logs (the Fluent Bit component needs the taint tolerations added).
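Since Batch nodes carry the batch-node taints shown in the node configuration, a Fluent Bit DaemonSet will not be scheduled onto them unless it tolerates those taints. A sketch of the tolerations to add to the DaemonSet's pod spec:

```yaml
# Tolerations matching the Batch node taints, so log-collector
# pods can land on Batch-managed nodes
tolerations:
- key: batch.amazonaws.com/batch-node
  effect: NoSchedule
  operator: Exists
- key: batch.amazonaws.com/batch-node
  effect: NoExecute
  operator: Exists
```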
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  labels:
    batch.amazonaws.com/compute-environment-uuid: 6c63cab8-8b00-3021-bb3d-fb990cef9c60
    batch.amazonaws.com/job-id: a5b695fc-8847-4a32-bfb5-99c6cf66c1df
    batch.amazonaws.com/node-uid: ffc8beb4-f326-4135-a471-e0b1d9511012
  name: aws-batch.b08aaab0-59e6-39b7-ada4-bbae690412b2
  namespace: my-aws-batch-namespace
spec:
  containers:
  - command:
    - sleep
    - "60"
    env:
    - name: AWS_BATCH_JOB_KUBERNETES_NODE_UID
      value: ffc8beb4-f326-4135-a471-e0b1d9511012
    - name: AWS_BATCH_JOB_ID
      value: a5b695fc-8847-4a32-bfb5-99c6cf66c1df
    - name: AWS_BATCH_JQ_NAME
      value: My-eks-JQ1
    - name: AWS_BATCH_JOB_ATTEMPT
      value: "1"
    - name: AWS_BATCH_CE_NAME
      value: My-eks-CE1
    image: public.ecr.aws/amazonlinux/amazonlinux:2
    imagePullPolicy: IfNotPresent
    name: default
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: "1"
        memory: 1Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-xddn2
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: ip-192-168-15-116.cn-north-1.compute.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: batch.amazonaws.com/batch-node
    operator: Exists
  - effect: NoExecute
    key: batch.amazonaws.com/batch-node
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-xddn2
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace