计算化学公社 (Computational Chemistry Commune)

Title: Help: CP2K jobs keep hitting this error and terminate automatically

Author: 啊嘞嘞    Time: 2025-3-10 11:12


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:


Program received signal SIGILL: Illegal instruction.


Backtrace for this error:
#0  0x2b498e91b26f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
#9  0x5abe12 in cp2k_run
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:314
#10  0x5b0939 in __cp2k_runs_MOD_run_input
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:983
#11  0x591494 in cp2k
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:379
#12  0x55e78c in main
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:44
#0  0x2b05c9d5226f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#0  0x2ba5f774626f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#9  0x5abe12 in cp2k_run
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:314
#10  0x5b0939 in __cp2k_runs_MOD_run_input
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:983
#0  0x2b17de44226f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
#11  0x591494 in cp2k
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:379
#12  0x55e78c in main
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:44
#9  0x5abe12 in cp2k_run
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:314
#10  0x5b0939 in __cp2k_runs_MOD_run_input
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:983
#11  0x591494 in cp2k
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:379
#12  0x55e78c in main
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:44
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#0  0x2b7263b1226f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#0  0x2b617d67726f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#9  0x5abe12 in cp2k_run
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:314
#10  0x5b0939 in __cp2k_runs_MOD_run_input
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:983
#11  0x591494 in cp2k
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:379
#12  0x55e78c in main
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:44
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#0  0x2b5a1b8da26f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#9  0x5abe12 in cp2k_run
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:314
#10  0x5b0939 in __cp2k_runs_MOD_run_input
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:983
#11  0x591494 in cp2k
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:379
#12  0x55e78c in main
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:44
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#9  0x5abe12 in cp2k_run
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:314
#10  0x5b0939 in __cp2k_runs_MOD_run_input
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:983
#11  0x591494 in cp2k
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:379
#12  0x55e78c in main
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:44
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
#0  0x2ad12569926f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#0  0x2b2e85f4226f in ???
#9  0x5abe12 in cp2k_run
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:314
#10  0x5b0939 in __cp2k_runs_MOD_run_input
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k_runs.F:983
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#11  0x591494 in cp2k
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:379
#12  0x55e78c in main
        at /data/home/MHX/soft/cp2k-2024.1/src/start/cp2k.F:44
#0  0x2b7c36fb926f in ???
#1  0x7220192 in dgemm_
        at /data/home/MHX/soft/cp2k-2024.1/tools/toolchain/build/OpenBLAS-0.3.25/interface/gemm.c:241
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#2  0x2feb8e4 in __basis_set_types_MOD_init_cphi_and_sphi
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:835
#3  0x2fec2ef in __basis_set_types_MOD_init_orb_basis_set
        at /data/home/MHX/soft/cp2k-2024.1/src/aobasis/basis_set_types.F:1087
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
#4  0x13d9af4 in init_qs_kind
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1253
#5  0x13d9af4 in __qs_kind_types_MOD_init_qs_kind_set
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_kind_types.F:1284
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
#6  0x1c9cd86 in qs_init_subsys
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:996
#7  0x1ca36ab in __qs_environment_MOD_qs_init
        at /data/home/MHX/soft/cp2k-2024.1/src/qs_environment.F:365
#8  0x1040ecd in __f77_interface_MOD_create_force_env
        at /data/home/MHX/soft/cp2k-2024.1/src/f77_interface.F:805
--------------------------------------------------------------------------
mpirun noticed that process rank 9 with PID 5510 on node wtnode6 exited on signal 4 (Illegal instruction).
--------------------------------------------------------------------------

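Author: abin    Time: 2025-3-10 12:35
Illegal instruction.

Please recompile.

If the CPU instruction set of the login node differs from the CPU architecture/instruction set of the compute nodes,
this problem will occur.

Either learn cross-compilation, or compile on a compute node.

That is all.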
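Author: 啊嘞嘞    Time: 2025-3-10 12:53
This post was last edited by 啊嘞嘞 on 2025-3-10 12:58
abin posted on 2025-3-10 12:35
Illegal instruction.

Please recompile.

Hello, I have a question: why did it run normally right after installation, and only start showing this problem after I had used it for a while?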
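Author: abin    Time: 2025-3-10 16:29
啊嘞嘞 posted on 2025-3-10 12:53
Hello, I have a question: why did it run normally right after installation, and only start showing this problem after I had used it for a while?

If you paste the context above into any AI, you will probably get the following conclusion.

The cluster may contain more than one CPU model.

Users compile programs on the login node (in most cases).
The scheduler dispatches the job to node A. A and the build node support the same CPU instruction set, so the job runs.

At some later point another job is dispatched to machine C. C differs from A and the build node in CPU model or supported instruction set, so the job cannot run there.

Look carefully: the CP2K toolchain by default uses something like xHost, that is, it optimizes for the CPU of the build machine.
A binary built this way may use instructions that a different processor does not support, so it fails there.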
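
One way to test this explanation is to compare the CPU feature flags of the build node with those of the node where the job crashed. Below is a minimal sketch, assuming x86 Linux nodes where `/proc/cpuinfo` contains a `flags` line. The function names and the sample strings are made up for illustration and are not taken from this cluster.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_run_node(build_cpuinfo, run_cpuinfo):
    """Flags present on the build node but absent on the run node, sorted."""
    build_flags = parse_cpu_flags(build_cpuinfo)
    run_flags = parse_cpu_flags(run_cpuinfo)
    return sorted(build_flags - run_flags)


if __name__ == "__main__":
    # Hypothetical sample text; on a real cluster, save the content of
    # /proc/cpuinfo from each node to a file and read those files instead.
    build_node = "model name : hypothetical CPU A\nflags : fpu sse4_2 avx avx2 avx512f\n"
    run_node = "model name : hypothetical CPU B\nflags : fpu sse4_2 avx avx2\n"
    print(missing_on_run_node(build_node, run_node))
```

If the result is not empty, a build optimized for the build node may contain instructions that the run node cannot execute, which would be consistent with the SIGILL shown in the first post.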


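Author: 啊嘞嘞    Time: 2025-3-10 21:36
abin posted on 2025-3-10 16:29
If you paste the context above into any AI, you will probably get the following conclusion.

The cluster may contain more than one CPU model.

Is recompiling the only way to fix this?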
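Author: logzzz    Time: 2025-3-10 22:18
First find out whether all jobs fail or only one type of job. I ran into this error before with TDDFT calculations; after I compiled with MPICH instead of Open MPI, the calculations ran normally.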
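Author: 啊嘞嘞    Time: 2025-3-10 22:43
logzzz posted on 2025-3-10 22:18
First find out whether all jobs fail or only one type of job. I ran into this error before with TDDFT calculations; after I compiled with MPICH instead of Open MPI, ...

CP2K jobs fail like this; Gaussian jobs run normally.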
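Author: abin    Time: 2025-3-11 13:47
The publicly distributed Gaussian builds include:
SSE4.2
AVX
AVX2 (I do not remember whether this one exists).

Go and check: every processor you can find supports these instruction sets, right?
Even the old Zen 2 architecture supports AVX2, right?

CP2K's default build mode is native, in other words xHost.
Built this way, if the build machine and the compute machine have different processor models, the program will most likely crash.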
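
To check which of these instruction sets a given node supports, something like the sketch below can be run on each node, for example inside a job script. It assumes x86 Linux, where `/proc/cpuinfo` lists feature flags. The mapping to flag tokens and the function names are my own illustration, and AVX-512F is an extra entry that the post above does not mention.

```python
import socket

# Names used in the post mapped to Linux /proc/cpuinfo flag tokens.
# AVX-512F is an additional entry added here for illustration only.
ISA_FLAGS = {
    "SSE4.2": "sse4_2",
    "AVX": "avx",
    "AVX2": "avx2",
    "AVX-512F": "avx512f",
}


def read_flags(path="/proc/cpuinfo"):
    """Read the CPU feature flags of the current node (empty set on failure)."""
    try:
        with open(path) as handle:
            for line in handle:
                key, sep, value = line.partition(":")
                if sep and key.strip() == "flags":
                    return set(value.split())
    except OSError:
        pass
    return set()


def isa_report(flags):
    """Map each instruction-set name to True or False for the given flag set."""
    return {name: token in flags for name, token in ISA_FLAGS.items()}


if __name__ == "__main__":
    # Print the host name so reports from different nodes can be told apart.
    print(socket.gethostname(), isa_report(read_flags()))
```

Comparing the report from the machine where CP2K was compiled with the report from the node named in the mpirun error message shows whether the two differ in any of the listed instruction sets.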


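Author: Santz    Time: 2025-3-11 14:00
abin posted on 2025-3-11 13:47
The publicly distributed Gaussian builds include:
SSE4.2
AVX

Most likely this is the common case where the login node is AMD, the compute nodes are Intel, and the software was installed directly on the login node. GG.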
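Author: 啊嘞嘞    Time: 2025-3-11 17:15
Santz posted on 2025-3-11 14:00
Most likely this is the common case where the login node is AMD, the compute nodes are Intel, and the software was installed directly on the login node. GG.

The processors should support these instruction sets, but I did not follow what you all said about the processor models of the build machine and the compute machine, or about login nodes and compute nodes. I installed it under my own account, not in the root directory. CP2K ran at first; now every run gives the error above.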
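Author: abin    Time: 2025-3-11 20:11
啊嘞嘞 posted on 2025-3-11 17:15
The processors should support these instruction sets, but I did not follow what you all said about the processor models of the build machine and the compute machine, or about login nodes and compute nodes. I ...

"I did not follow what you all said about the processor models of the build machine and the compute machine, or about login nodes and compute nodes. I installed it under my own account, not in the root directory."
If you insist on this chain of reasoning,
not even a god can save you.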
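Author: 啊嘞嘞    Time: 2025-3-11 21:50
abin posted on 2025-3-11 20:11
"I did not follow what you all said about the processor models of the build machine and the compute machine, or about login nodes and compute nodes. I installed it under my own account, not in the root directory" ...

I am not insisting; the more I read, the more confused I get. What I want to ask is how to tell a compute node from a login node, and what counts as installing the software on a compute node so that this problem is solved.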
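Author: abin    Time: 2025-3-13 14:06
啊嘞嘞 posted on 2025-3-11 21:50
I am not insisting; the more I read, the more confused I get. What I want to ask is how to tell a compute node from a login node, and what counts as installing the software on a compute node so that ...

There is material on the architecture of general-purpose clusters here: https://hpc4you.github.io
I recommend reading it.

Your questions:
How to tell a compute node from a login node
What counts as installing the software on a compute node

These show that your concept and understanding of a cluster are basically wrong.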
Author: 啊嘞嘞    Time: 2025-3-14 09:11
abin posted on 2025-3-13 14:06
There is material on the architecture of general-purpose clusters here: https://hpc4you.github.io
I recommend reading it.

OK, thank you very much.




Welcome to 计算化学公社 (http://ccc.keinsci.com/) Powered by Discuz! X3.3