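Introduction
AWS ParallelCluster is an AWS-supported open source cluster management tool that helps you deploy and manage high performance computing (HPC) clusters. Built on the open source CfnCluster project, it lets you stand up an HPC environment quickly by automatically provisioning the required compute resources and shared file system. AWS ParallelCluster supports AWS Batch and Slurm as job schedulers; earlier versions also supported PBS and SGE. It suits both quick proof-of-concept deployments and production deployments, and you can build higher-level workflows on top of it, such as CFD workloads.
AWS ParallelCluster integrates with other AWS HPC services, such as NICE DCV for remote visualization and FSx for Lustre as a high-performance file system. DCV is useful for CFD pre- and post-processing: in a typical scenario, an engineer connects through DCV and opens the final model in CFD-Post to inspect and validate the results, or uses ICEM for pre-processing. FSx for Lustre provides the bandwidth and latency that HPC workloads require.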
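NICE DCV is a high-performance remote display protocol that gives customers a secure way to stream remote desktops and applications from any cloud or data center to any device, over varying network conditions. With NICE DCV and Amazon EC2, customers can run graphics-intensive applications remotely on EC2 instances and stream the results to a client machine, with no need for expensive dedicated workstations. Customers across a broad range of HPC workloads use NICE DCV for remote visualization. There is no additional charge for using NICE DCV on Amazon EC2; you pay only for the EC2 resources used to run and store your workloads.
FSx for Lustre makes it easy and cost-effective to launch and run the popular high-performance Lustre file system. You can use Lustre for workloads such as machine learning, high performance computing (HPC), video processing, and financial modeling.
The open source Lustre file system is designed for applications that need fast storage. Lustre was built to solve the problem of processing the world's ever-growing datasets quickly and cheaply. It is a widely used file system designed for the fastest computers in the world. It provides sub-millisecond latencies, up to hundreds of GB/s of throughput, and up to millions of IOPS.
As a fully managed service, Amazon FSx makes it quick to use Lustre for workloads where storage speed matters. FSx for Lustre removes the traditional complexity of setting up and managing Lustre file systems, so you can launch a high-performance file system in minutes. It also offers multiple deployment options so you can optimize cost for your needs.
FSx for Lustre is POSIX-compliant, so you can use your current Linux-based applications without any changes. It behaves like any other file system on Linux. It also provides read-after-write consistency and supports file locking.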
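ANSYS Fluent is one of the most widely used commercial CFD packages, with a reported market share of about 60% in the United States. It applies to any industry involving fluid flow, heat transfer, or chemical reactions. It offers a rich set of physical models, advanced numerical methods, and powerful pre- and post-processing, and is widely used in aerospace, automotive design, oil and gas, and turbomachinery design.
ParallelCluster 3 integrates the Slurm and AWS Batch job schedulers; Slurm is the one suited to CFD job scheduling. Slurm (Simple Linux Utility for Resource Management, http://slurm.schedmd.com/ ) is an open source, fault-tolerant, and highly scalable resource manager and job scheduler for Linux clusters and supercomputers. A supercomputing system uses Slurm to manage resources and jobs so that they do not interfere with one another and the system runs more efficiently. Every job, whether for debugging or production computation, is submitted with a command such as srun (interactive parallel), sbatch (batch), or salloc (allocation); after submission, related commands report the job's status.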
Deployment
Install ParallelCluster
Prerequisites
AWS ParallelCluster requires Python 3.6 or later. If it is not installed, download a compatible version from https://www.python.org/downloads/ and install it first.
$ python3
Python 3.7.10 (default, Jun 3 2021, 00:02:01)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-13)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
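The interactive check above can also be scripted. The following is a minimal sketch, assuming GNU `sort -V` is available (as it is on Amazon Linux 2); the helper name `version_at_least` is hypothetical and not part of any AWS tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version "$1" (e.g. 3.7.10) is at least "$2" (e.g. 3.6).
version_at_least() {
    local have="$1" need="$2"
    # sort -V orders version strings; if the smaller of the two is $need, then have >= need.
    [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}

if command -v python3 >/dev/null 2>&1; then
    py_version="$(python3 -c 'import platform; print(platform.python_version())')"
    if version_at_least "$py_version" "3.6"; then
        echo "python3 $py_version is new enough"
    else
        echo "python3 $py_version is too old; 3.6 or later is required" >&2
    fi
else
    echo "python3 not found" >&2
fi
```

Comparing with `sort -V` rather than as plain strings matters because a string comparison would rank 3.10 below 3.6.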
Install virtualenv
$ python3 -m pip install --upgrade pip
Defaulting to user installation because normal site-packages is not writeable
Collecting pip
Downloading pip-22.2.1-py3-none-any.whl (2.0 MB)
|████████████████████████████████| 2.0 MB 44.7 MB/s
Installing collected packages: pip
Successfully installed pip-22.2.1
$ python3 -m pip install --user --upgrade virtualenv
Collecting virtualenv
Downloading virtualenv-20.16.2-py2.py3-none-any.whl (8.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 89.1 MB/s eta 0:00:00
Collecting distlib<1,>=0.3.1
Downloading distlib-0.3.5-py2.py3-none-any.whl (466 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 467.0/467.0 kB 71.2 MB/s eta 0:00:00
Collecting importlib-metadata>=0.12
Downloading importlib_metadata-4.12.0-py3-none-any.whl (21 kB)
Collecting platformdirs<3,>=2
Downloading platformdirs-2.5.2-py3-none-any.whl (14 kB)
Collecting filelock<4,>=3.2
Downloading filelock-3.7.1-py3-none-any.whl (10 kB)
Collecting typing-extensions>=3.6.4
Downloading typing_extensions-4.3.0-py3-none-any.whl (25 kB)
Collecting zipp>=0.5
Downloading zipp-3.8.1-py3-none-any.whl (5.6 kB)
Installing collected packages: distlib, zipp, typing-extensions, platformdirs, filelock, importlib-metadata, virtualenv
Successfully installed distlib-0.3.5 filelock-3.7.1 importlib-metadata-4.12.0 platformdirs-2.5.2 typing-extensions-4.3.0 virtualenv-20.16.2 zipp-3.8.1
Create a virtualenv and give it a name
$ python3 -m virtualenv ~/apc-ve
created virtual environment CPython3.7.10.final.0-64 in 850ms
creator CPython3Posix(dest=/home/ec2-user/apc-ve, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/ec2-user/.local/share/virtualenv)
added seed packages: pip==22.2.1, setuptools==63.2.0, wheel==0.37.1
activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
This creates the directory apc-ve in your home directory.
Activate the new virtualenv
$ source ~/apc-ve/bin/activate
Install AWS ParallelCluster inside the virtual environment
$ python3 -m pip install --upgrade "aws-parallelcluster"
Collecting aws-parallelcluster
Downloading aws_parallelcluster-3.2.0-py3-none-any.whl (424 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 425.0/425.0 kB 37.8 MB/s eta 0:00:00
Collecting aws-cdk.aws-batch!=1.153.0,~=1.137
Downloading aws_cdk.aws_batch-1.167.0-py3-none-any.whl (333 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 333.6/333.6 kB 52.3 MB/s eta 0:00:00
Collecting jmespath~=0.10
Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting aws-cdk.aws-cloudwatch!=1.153.0,~=1.137
Downloading aws_cdk.aws_cloudwatch-1.167.0-py3-none-any.whl (379 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 379.1/379.1 kB 44.9 MB/s eta 0:00:00
Collecting aws-cdk.core!=1.153.0,~=1.137
Downloading aws_cdk.core-1.167.0-py3-none-any.whl (1.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 95.1 MB/s eta 0:00:00
……
Collecting certifi>=2017.4.17
Downloading certifi-2022.6.15-py3-none-any.whl (160 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 160.2/160.2 kB 41.5 MB/s eta 0:00:00
Collecting exceptiongroup
Downloading exceptiongroup-1.0.0rc8-py3-none-any.whl (11 kB)
Collecting six>=1.5
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: publication, zipp, urllib3, typing-extensions, typeguard, tabulate, six, PyYAML, pyrsistent, pyparsing, pkgutil-resolve-name, MarkupSafe, jmespath, itsdangerous, inflection, idna, exceptiongroup, charset-normalizer, certifi, attrs, werkzeug, requests, python-dateutil, packaging, jinja2, importlib-resources, importlib-metadata, cattrs, marshmallow, jsonschema, jsii, click, botocore, s3transfer, flask, constructs, clickclick, aws-cdk.region-info, aws-cdk.cloud-assembly-schema, connexion, boto3, aws-cdk.cx-api, aws-cdk.core, aws-cdk.aws-signer, aws-cdk.aws-sam, aws-cdk.aws-imagebuilder, aws-cdk.aws-iam, aws-cdk.aws-codestarnotifications, aws-cdk.aws-acmpca, aws-cdk.assets, aws-cdk.aws-kms, aws-cdk.aws-events, aws-cdk.aws-codeguruprofiler, aws-cdk.aws-cloudwatch, aws-cdk.aws-autoscaling-common, aws-cdk.aws-ssm, aws-cdk.aws-sqs, aws-cdk.aws-s3, aws-cdk.aws-ecr, aws-cdk.aws-applicationautoscaling, aws-cdk.aws-sns, aws-cdk.aws-s3-assets, aws-cdk.aws-ecr-assets, aws-cdk.aws-logs, aws-cdk.aws-codecommit, aws-cdk.aws-stepfunctions, aws-cdk.aws-kinesis, aws-cdk.aws-ec2, aws-cdk.aws-fsx, aws-cdk.aws-elasticloadbalancing, aws-cdk.aws-efs, aws-cdk.aws-lambda, aws-cdk.aws-sns-subscriptions, aws-cdk.aws-secretsmanager, aws-cdk.aws-cloudformation, aws-cdk.custom-resources, aws-cdk.aws-codebuild, aws-cdk.aws-route53, aws-cdk.aws-globalaccelerator, aws-cdk.aws-dynamodb, aws-cdk.aws-certificatemanager, aws-cdk.aws-elasticloadbalancingv2, aws-cdk.aws-cognito, aws-cdk.aws-cloudfront, aws-cdk.aws-servicediscovery, aws-cdk.aws-autoscaling, aws-cdk.aws-apigateway, aws-cdk.aws-route53-targets, aws-cdk.aws-autoscaling-hooktargets, aws-cdk.aws-ecs, aws-cdk.aws-batch, aws-parallelcluster
Successfully installed MarkupSafe-2.1.1 PyYAML-5.4.1 attrs-21.4.0 aws-cdk.assets-1.167.0 aws-cdk.aws-acmpca-1.167.0 aws-cdk.aws-apigateway-1.167.0 aws-cdk.aws-applicationautoscaling-1.167.0 aws-cdk.aws-autoscaling-1.167.0 aws-cdk.aws-autoscaling-common-1.167.0 aws-cdk.aws-autoscaling-hooktargets-1.167.0 aws-cdk.aws-batch-1.167.0 aws-cdk.aws-certificatemanager-1.167.0 aws-cdk.aws-cloudformation-1.167.0 aws-cdk.aws-cloudfront-1.167.0 aws-cdk.aws-cloudwatch-1.167.0 aws-cdk.aws-codebuild-1.167.0 aws-cdk.aws-codecommit-1.167.0 aws-cdk.aws-codeguruprofiler-1.167.0 aws-cdk.aws-codestarnotifications-1.167.0 aws-cdk.aws-cognito-1.167.0 aws-cdk.aws-dynamodb-1.167.0 aws-cdk.aws-ec2-1.167.0 aws-cdk.aws-ecr-1.167.0 aws-cdk.aws-ecr-assets-1.167.0 aws-cdk.aws-ecs-1.167.0 aws-cdk.aws-efs-1.167.0 aws-cdk.aws-elasticloadbalancing-1.167.0 aws-cdk.aws-elasticloadbalancingv2-1.167.0 aws-cdk.aws-events-1.167.0 aws-cdk.aws-fsx-1.167.0 aws-cdk.aws-globalaccelerator-1.167.0 aws-cdk.aws-iam-1.167.0 aws-cdk.aws-imagebuilder-1.167.0 aws-cdk.aws-kinesis-1.167.0 aws-cdk.aws-kms-1.167.0 aws-cdk.aws-lambda-1.167.0 aws-cdk.aws-logs-1.167.0 aws-cdk.aws-route53-1.167.0 aws-cdk.aws-route53-targets-1.167.0 aws-cdk.aws-s3-1.167.0 aws-cdk.aws-s3-assets-1.167.0 aws-cdk.aws-sam-1.167.0 aws-cdk.aws-secretsmanager-1.167.0 aws-cdk.aws-servicediscovery-1.167.0 aws-cdk.aws-signer-1.167.0 aws-cdk.aws-sns-1.167.0 aws-cdk.aws-sns-subscriptions-1.167.0 aws-cdk.aws-sqs-1.167.0 aws-cdk.aws-ssm-1.167.0 aws-cdk.aws-stepfunctions-1.167.0 aws-cdk.cloud-assembly-schema-1.167.0 aws-cdk.core-1.167.0 aws-cdk.custom-resources-1.167.0 aws-cdk.cx-api-1.167.0 aws-cdk.region-info-1.167.0 aws-parallelcluster-3.2.0 boto3-1.24.44 botocore-1.27.44 cattrs-22.1.0 certifi-2022.6.15 charset-normalizer-2.1.0 click-8.1.3 clickclick-20.10.2 connexion-2.13.1 constructs-3.4.58 exceptiongroup-1.0.0rc8 flask-2.2.0 idna-3.3 importlib-metadata-4.12.0 importlib-resources-5.9.0 inflection-0.5.1 itsdangerous-2.1.2 jinja2-3.1.2 jmespath-0.10.0 
jsii-1.63.2 jsonschema-4.9.0 marshmallow-3.17.0 packaging-21.3 pkgutil-resolve-name-1.3.10 publication-0.0.3 pyparsing-3.0.9 pyrsistent-0.18.1 python-dateutil-2.8.2 requests-2.28.1 s3transfer-0.6.0 six-1.16.0 tabulate-0.8.10 typeguard-2.13.3 typing-extensions-4.3.0 urllib3-1.26.11 werkzeug-2.2.1 zipp-3.8.1
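Install Node Version Manager and Node.js
AWS ParallelCluster uses the AWS Cloud Development Kit (AWS CDK) to generate templates, and the CDK requires Node.js. Node Version Manager (nvm) is used here to install it.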
$ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.38.0/install.sh | bash
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 14926 100 14926 0 0 469k 0 --:--:-- --:--:-- --:--:-- 485k
=> Downloading nvm as script to '/home/ec2-user/.nvm'
=> Appending nvm source string to /home/ec2-user/.bashrc
=> Appending bash_completion source string to /home/ec2-user/.bashrc
=> Close and reopen your terminal to start using nvm or run the following to use it now:
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion" # This loads nvm bash_completion
$ chmod ug+x ~/.nvm/nvm.sh
$ source ~/.nvm/nvm.sh
$ nvm install --lts
Installing latest LTS version.
Downloading and installing node v16.16.0...
Downloading https://nodejs.org/dist/v16.16.0/node-v16.16.0-linux-x64.tar.xz...
################################################################################################################################################################################## 100.0%
Computing checksum with sha256sum
Checksums matched!
Now using node v16.16.0 (npm v8.11.0)
Creating default alias: default -> lts/* (-> v16.16.0)
$ node --version
Verify the AWS ParallelCluster installation
Activate the virtualenv
$ source ~/apc-ve/bin/activate
$ pcluster version
{
"version": "3.2.0"
}
Configure AWS ParallelCluster
$ aws configure
AWS Access Key ID [None]: AKIA5OZOUQ4F2T4IMAOS
AWS Secret Access Key [None]: XXX
Default region name [None]: cn-northwest-1
Default output format [None]:
$ pcluster configure --config cluster-config.yaml
INFO: Configuration file cluster-config.yaml will be written.
Press CTRL-C to interrupt the procedure.
Allowed values for AWS Region ID:
1. cn-north-1
2. cn-northwest-1
AWS Region ID [cn-northwest-1]:
Allowed values for EC2 Key Pair Name:
1. LL-K2
EC2 Key Pair Name [LL-K2]:
Allowed values for Scheduler:
1. slurm
2. awsbatch
Scheduler [slurm]:
Allowed values for Operating System:
1. alinux2
2. centos7
3. ubuntu1804
4. ubuntu2004
Operating System [alinux2]: alinux2
Head node instance type [t2.micro]: c5.large
Number of queues [1]:
Name of queue 1 [queue1]:
Number of compute resources for queue1 [1]:
Compute instance type for compute resource 1 in queue1 [t2.micro]: c5.xlarge
Maximum instance count [10]:
Automate VPC creation? (y/n) [n]:
Allowed values for VPC ID:
# id name number_of_subnets
--- --------------------- ----------------- -------------------
1 vpc-003630feddf7d2417 EKS 2
2 vpc-013d1e62cfa405b8e ECS 2
3 vpc-0252e11202ae27e51 2
4 vpc-9b64d8f2 HPC 3
VPC ID [vpc-003630feddf7d2417]: vpc-9b64d8f2
Automate Subnet creation? (y/n) [y]:
Allowed values for Availability Zone:
1. cn-northwest-1a
2. cn-northwest-1b
3. cn-northwest-1c
Availability Zone [cn-northwest-1a]:
Allowed values for Network Configuration:
1. Head node in a public subnet and compute fleet in a private subnet
2. Head node and compute fleet in the same public subnet
Network Configuration [Head node in a public subnet and compute fleet in a private subnet]:
Creating CloudFormation stack...
Do not leave the terminal until the process has finished.
Stack Name: parallelclusternetworking-pubpriv-20220729030718 (id: arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/parallelclusternetworking-pubpriv-20220729030718/b846e230-0eeb-11ed-979c-0a9d1a8a4fe6)
Status: parallelclusternetworking-pubpriv-20220729030718 - CREATE_COMPLETE
The stack has been created.
Configuration file written to cluster-config.yaml
You can edit your configuration file or simply run 'pcluster create-cluster --cluster-configuration cluster-config.yaml --cluster-name cluster-name --region cn-northwest-1' to create your cluster.
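For orientation, the answers given to the wizard above correspond roughly to a configuration like the one printed below. This is a sketch under assumptions, not the file the wizard actually wrote: the subnet IDs are placeholders, `MinCount: 0` is inferred from the dynamic nodes shown later by `sinfo`, and the function name `print_cluster_config` is hypothetical.

```shell
#!/usr/bin/env bash
# Print a sketch of a ParallelCluster 3 configuration matching the wizard answers.
# Arguments: head node subnet ID, compute subnet ID (both placeholders here).
print_cluster_config() {
    local head_subnet="$1" compute_subnet="$2"
    cat <<EOF
Region: cn-northwest-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: c5.large
  Networking:
    SubnetId: ${head_subnet}
  Ssh:
    KeyName: LL-K2
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: queue1
      ComputeResources:
        - Name: c5xlarge
          InstanceType: c5.xlarge
          MinCount: 0
          MaxCount: 10
      Networking:
        SubnetIds:
          - ${compute_subnet}
EOF
}

# Placeholder subnet IDs; substitute the ones created by the networking stack.
print_cluster_config "subnet-HEAD-PLACEHOLDER" "subnet-COMPUTE-PLACEHOLDER"
```

Comparing this shape with the generated cluster-config.yaml makes it easier to see where the DCV and shared storage sections described next belong.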
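Create the CFD cluster
Configuration file
Edit cluster-config.yaml to suit the HPC/CFD workload: enable NICE DCV remote visualization for pre- and post-processing, and add the FSx for Lustre high-performance file system used by the flow computation.
1. NICE DCV
2. FSx for Lustre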
SharedStorage:
  - MountDir: /fsx
    Name: ParallelFileSystem
    StorageType: FsxLustre
    FsxLustreSettings:
      StorageCapacity: 1200
      DeploymentType: PERSISTENT_1
      ImportedFileChunkSize: 1024
      ExportPath: s3://plljdi-fs1/export
      ImportPath: s3://plljdi-fs1
      PerUnitStorageThroughput: 200
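The NICE DCV settings themselves are not shown in the text. As a minimal sketch, assuming the ParallelCluster 3 `HeadNode` / `Dcv` section with its `Enabled`, `Port`, and `AllowedIps` keys, the fragment could look like the output of the hypothetical helper below. Port 8443 matches the connection URL shown later; the default CIDR is only an illustration and should be narrowed for real use.

```shell
#!/usr/bin/env bash
# Print a sketch of the DCV settings for the head node.
# Argument: CIDR range allowed to reach the DCV port (assumed default: open to all).
print_dcv_fragment() {
    local allowed_ips="${1:-0.0.0.0/0}"
    cat <<EOF
HeadNode:
  Dcv:
    Enabled: true
    Port: 8443
    AllowedIps: ${allowed_ips}
EOF
}

print_dcv_fragment "203.0.113.0/24"
```

The `Dcv` block has to be merged into the existing `HeadNode` section of cluster-config.yaml; adding a second top-level `HeadNode` key would not be valid.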
ANSYS Fluent currently supports CentOS 7; Amazon Linux 2 is not on ANSYS's list of officially certified operating systems.
Create the cluster
$ pcluster create-cluster --cluster-name cfd-cluster --cluster-configuration cfd-cluster-config.yaml
{
"cluster": {
"clusterName": "cfd-cluster",
"cloudformationStackStatus": "CREATE_IN_PROGRESS",
"cloudformationStackArn": "arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/test-cluster/348e1c40-0eed-11ed-b3f5-0a96b85a5424",
"region": "cn-northwest-1",
"version": "3.1.4",
"clusterStatus": "CREATE_IN_PROGRESS"
}
}
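Cluster creation takes several minutes, so it can be convenient to poll until it finishes. The following is a minimal sketch, assuming the `--query` option (a JMESPath expression, as used with `list-clusters` in this walkthrough) is also accepted by `describe-cluster`. The helper name `wait_for_cluster` and the 30-second interval are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Poll the cluster status until creation completes or fails.
# Arguments: cluster name, polling interval in seconds (default 30).
wait_for_cluster() {
    local name="$1" interval="${2:-30}" status
    while true; do
        # The query result is printed as a JSON string, so strip the quotes.
        status="$(pcluster describe-cluster --cluster-name "$name" --query clusterStatus | tr -d '"')"
        echo "clusterStatus: $status"
        case "$status" in
            CREATE_COMPLETE) return 0 ;;
            CREATE_IN_PROGRESS) sleep "$interval" ;;
            *) return 1 ;;  # CREATE_FAILED or anything unexpected
        esac
    done
}

# Usage (inside the activated virtualenv):
#   wait_for_cluster cfd-cluster && echo "cluster is ready"
```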
Query cluster information
$ pcluster describe-cluster --cluster-name cfd-cluster
{
"creationTime": "2022-07-29T10:31:33.608Z",
"headNode": {
"launchTime": "2022-07-29T10:40:14.000Z",
"instanceId": "i-0e3c4967953c806a7",
"publicIpAddress": "52.83.49.88",
"instanceType": "c5.large",
"state": "running",
"privateIpAddress": "172.31.48.96"
},
"version": "3.1.4",
"clusterConfiguration": {
"url": "https://parallelcluster-02fb13f6f8ec970c-v1-do-not-delete---s3---cn-northwest-1.amazonaws.com.rproxy.goskope.com.cn/parallelcluster/3.1.4/clusters/cfd-cluster-7p51jnbemquummo3/configs/cluster-config.yaml?versionId=sf6OxDbpIGYPjmrRfSSArCU5YRUHzCqo&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5OZOUQ4F2T4IMAOS%2F20220805%2Fcn-northwest-1%2Fs3%2Faws4_request&X-Amz-Date=20220805T021305Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=f7bd1e1e31bdcc9d3bf7b260d68f418e39a7239fdf4baf0983cb1e399cdea35e"
},
"tags": [
{
"value": "3.1.4",
"key": "parallelcluster:version"
}
],
"cloudFormationStackStatus": "CREATE_COMPLETE",
"clusterName": "cfd-cluster",
"computeFleetStatus": "RUNNING",
"cloudformationStackArn": "arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/cfd-cluster/9dc591c0-0f29-11ed-a5cd-02357b891a1c",
"lastUpdatedTime": "2022-07-29T10:31:33.608Z",
"region": "cn-northwest-1",
"clusterStatus": "CREATE_COMPLETE"
}
$ pcluster list-clusters --query 'clusters[?clusterName==`cfd-cluster`]'
[
{
"clusterName": "cfd-cluster",
"cloudformationStackStatus": "CREATE_IN_PROGRESS",
"cloudformationStackArn": "arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/cfd-cluster/f7316cd0-1464-11ed-8f62-0aa55a928096",
"region": "cn-northwest-1",
"version": "3.1.4",
"clusterStatus": "CREATE_IN_PROGRESS"
}
]
Log in to the cluster
$ pcluster ssh --cluster-name cfd-cluster -i ~/LL-K2.pem
Check the Slurm cluster status
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
queue1* up infinite 10 idle~ queue1-dy-c5xlarge-[1-10]
$ sinfo -l
Fri Aug 05 02:56:34 2022
PARTITION AVAIL TIMELIMIT JOB_SIZE ROOT OVERSUBS GROUPS NODES STATE NODELIST
queue1* up infinite 1-infinite no NO all 10 idle~ queue1-dy-c5xlarge-[1-10]
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
$ srun -n4 -l hostname
0: queue1-dy-c5xlarge-1
2: queue1-dy-c5xlarge-1
3: queue1-dy-c5xlarge-1
1: queue1-dy-c5xlarge-1
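The interactive `srun` test has a batch equivalent. The sketch below prints a minimal job script that could be saved and submitted with `sbatch`; the function name `print_hello_job`, the job name, and the output file pattern are illustrative assumptions, while `queue1` is the partition reported by `sinfo` above.

```shell
#!/usr/bin/env bash
# Print a minimal Slurm batch script equivalent to `srun -n4 -l hostname`.
# Argument: number of tasks (default 4).
print_hello_job() {
    local ntasks="${1:-4}"
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --partition=queue1
#SBATCH --ntasks=${ntasks}
#SBATCH --output=hello-%j.out

# -l prefixes each output line with the task number, as in the interactive test.
srun -l hostname
EOF
}

# Usage on the head node:
#   print_hello_job 4 > hello.sbatch
#   sbatch hello.sbatch
#   squeue
```

Because the compute nodes are dynamic (`idle~`), the first job waits while instances launch; `squeue` shows it pending until the nodes are up.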
Log in with DCV
Parameters of pcluster dcv-connect
pcluster dcv-connect [-h]
--cluster-name CLUSTER_NAME
[--debug]
[--key-path KEY_PATH]
[--region REGION]
[--show-url]
$ pcluster dcv-connect --cluster-name cfd-cluster --key-path ~/LL-K2.pem --show-url
Please use the following one-time URL in your browser within 30 seconds:
https://52.83.49.88:8443?authToken=Xh92zh9pJ3bWK1Sn_2gzdUEnf4GwjYWYyMmh2bWSq4n8Pm4jUWWbqCOuBG6CdWBLFpPwZLmi7WC8PM7t44DWwL9Lr85Cu_QWTaEg-A9tywg3TjA2waXRzQhhI8-URnDWfTpC8l6Od5IkaUyiAjqybRfK2a41yYHNYSYUc3uWL_UNKYgjjoqCjvwFyBpKa0WGo88mODGpLkyWNhU6dqiWTK-BMqbSXl3SttPQOgge6YIwvSyKB28rmP0JoyC4SkvN#8DWPj4h0HXiKPbh1yZ69
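Open the link in a browser to log in to the head node. CFD pre- and post-processing can run on the head node through DCV; depending on the resources they need, you can choose a GPU instance type for the head node.
Install Fluent
Obtain the installation media and license file from ANSYS, log in to the head node through DCV, and install the software under the shared FSx for Lustre directory so that every compute node can run the Fluent components. Follow the installer prompts.
After installation, configure the license server ports by adding the following two entries to the ansyslmd.ini file.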
SERVER=1055@licenseServer
ANSYSLI_SERVERS=2325@licenseServer
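Adding the entries can be scripted so that repeated runs do not duplicate them. This is a minimal sketch; `add_license_servers` is a hypothetical helper, and the location of ansyslmd.ini is an assumption that depends on the ANSYS version and install path, so it is passed in as an argument.

```shell
#!/usr/bin/env bash
# Append the two license entries to an ansyslmd.ini file unless already present.
# Arguments: path to ansyslmd.ini, license server host name.
add_license_servers() {
    local ini="$1" server="$2" line
    touch "$ini"
    for line in "SERVER=1055@${server}" "ANSYSLI_SERVERS=2325@${server}"; do
        # -x matches whole lines only, -F treats the entry as a literal string.
        grep -qxF "$line" "$ini" || echo "$line" >> "$ini"
    done
}

# Usage (the path is an assumption; check where your installation keeps the file):
#   add_license_servers /fsx/apps/ansys_inc/shared_files/licensing/ansyslmd.ini licenseServer
```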
Run Fluent and CFD-Post
Run Fluent
Log in through NICE DCV, then run /fsx/apps/ansys_inc/v195/fluent/bin/fluent
Fluent is used to run the CFD simulation. Because the Fluent GUI does not currently support Slurm scheduling, wrap the Fluent job in a script and submit it to Slurm with sbatch.
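One way to do the script integration is a batch script that builds a host file from the Slurm allocation and starts Fluent without its GUI. The sketch below prints such a script. It is not taken from the original walkthrough: the function name `print_fluent_job`, the journal file name, and the node counts are placeholders, and the Fluent options used (`3ddp`, `-g`, `-t`, `-cnf=`, `-i`) should be checked against the ANSYS documentation for your version.

```shell
#!/usr/bin/env bash
# Print a Slurm batch script that runs Fluent in batch mode across the allocated nodes.
# Arguments: node count, tasks per node, journal file (all placeholders by default).
print_fluent_job() {
    local nodes="${1:-2}" tasks_per_node="${2:-4}" journal="${3:-run.jou}"
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=fluent
#SBATCH --partition=queue1
#SBATCH --nodes=${nodes}
#SBATCH --ntasks-per-node=${tasks_per_node}
#SBATCH --output=fluent-%j.out

# Write one allocated host name per line for Fluent's -cnf option.
scontrol show hostnames "\$SLURM_JOB_NODELIST" > hosts.\$SLURM_JOB_ID

# 3ddp: 3D double precision, -g: no GUI, -t: process count, -i: journal to execute.
/fsx/apps/ansys_inc/v195/fluent/bin/fluent 3ddp -g -t\$SLURM_NTASKS \\
    -cnf=hosts.\$SLURM_JOB_ID -i ${journal}
EOF
}

# Usage on the head node:
#   print_fluent_job 2 4 run.jou > fluent.sbatch
#   sbatch fluent.sbatch
```

The Slurm variables are escaped in the template so that they are expanded when the job runs, not when the script is generated.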
Run CFD-Post
On Amazon Linux 2, the LD_LIBRARY_PATH environment variable must be set correctly, because some shared libraries have to be made visible to the runtime explicitly.
export LD_LIBRARY_PATH=/fsx/apps/ansys_inc/v195/commonfiles/CFX/support/fluentio/lib/linx64/:$LD_LIBRARY_PATH
Run /fsx/apps/ansys_inc/v195/CFD-Post/bin/cfdpost and use CFD-Post to view the simulation results, for example the perf_IndyCar.res result file.
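If the export is placed in a shell profile, it runs on every login and the same directory is added again each time. A minimal sketch of a guard follows; `prepend_ld_path` is a hypothetical helper. It also avoids leaving a trailing colon when LD_LIBRARY_PATH starts out empty, since an empty entry makes the dynamic loader search the current directory.

```shell
#!/usr/bin/env bash
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
prepend_ld_path() {
    local dir="${1%/}"  # drop one trailing slash so both spellings compare equal
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*) ;;  # already present, nothing to do
        *) export LD_LIBRARY_PATH="${dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" ;;
    esac
}

# Usage:
#   prepend_ld_path /fsx/apps/ansys_inc/v195/commonfiles/CFX/support/fluentio/lib/linx64/
#   /fsx/apps/ansys_inc/v195/CFD-Post/bin/cfdpost
```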
Cleanup
When the compute environment is no longer needed, delete the CFD cluster.
$ pcluster delete-cluster --region cn-northwest-1 --cluster-name cfd-cluster
In the AWS console, delete the CloudFormation networking stack.
Delete the VPC, if it was newly created.
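The cluster and networking stack can also be removed from the command line. The following is a minimal sketch, assuming the AWS CLI is configured and that the cluster's CloudFormation stack is named after the cluster, as the stack ARNs above suggest. The helpers `run` and `cleanup_cluster` are hypothetical, and setting `DRY_RUN=1` only prints the commands.

```shell
#!/usr/bin/env bash
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Arguments: cluster name, region, name of the networking stack created by `pcluster configure`.
cleanup_cluster() {
    local cluster="$1" region="$2" network_stack="$3"
    run pcluster delete-cluster --region "$region" --cluster-name "$cluster"
    # Wait for the cluster's own stack to disappear before touching the network it uses.
    run aws cloudformation wait stack-delete-complete --region "$region" --stack-name "$cluster"
    run aws cloudformation delete-stack --region "$region" --stack-name "$network_stack"
    run aws cloudformation wait stack-delete-complete --region "$region" --stack-name "$network_stack"
}

# Usage (preview first, then run for real):
#   DRY_RUN=1 cleanup_cluster cfd-cluster cn-northwest-1 parallelclusternetworking-pubpriv-20220729030718
```

A VPC that existed before the walkthrough is left alone by this sketch; only the subnets and related resources in the networking stack are removed.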
About the author