huhaohao525
Contributor

PR types

New Features

PR changes

Others

Describe

Add a chemistry example: "Generative AI for designing and validating easily synthesizable and structurally novel antibiotics".

paddle-bot bot commented Aug 19, 2025

Thanks for your contribution!

Collaborator

Some commented-out content can be removed, for example "# import paddle_aux".

Comment on lines 45 to 52
"""
:param args: A :class:`~chemprop.args.TrainArgs` object containing model arguments.
:param atom_fdim: Atom feature vector dimension.
:param bond_fdim: Bond feature vector dimension.
:param hidden_size: Hidden layers dimension
:param bias: Whether to add bias to linear layers
:param depth: Number of message passing steps
"""
Collaborator

Please move the __init__ docstring to directly under the class definition.
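As an illustration of the requested layout, here is a minimal sketch. ExampleEncoder, its default values, and its attributes are assumptions made for the illustration and are not the PR's actual class; only the parameter descriptions are taken from the docstring under review.

```python
class ExampleEncoder:
    """Hypothetical encoder, used only to show where the docstring goes.

    :param atom_fdim: Atom feature vector dimension.
    :param bond_fdim: Bond feature vector dimension.
    :param hidden_size: Hidden layers dimension.
    :param bias: Whether to add bias to linear layers.
    :param depth: Number of message passing steps.
    """

    def __init__(
        self,
        atom_fdim: int,
        bond_fdim: int,
        hidden_size: int = 300,  # placeholder default for the sketch
        bias: bool = False,  # placeholder default for the sketch
        depth: int = 3,  # placeholder default for the sketch
    ) -> None:
        # No docstring here: the parameters are documented on the class.
        self.atom_fdim = atom_fdim
        self.bond_fdim = bond_fdim
        self.hidden_size = hidden_size
        self.bias = bias
        self.depth = depth
```

With this layout, help(ExampleEncoder) and generated API docs show the parameter list at the class level.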

:param mol_graph: A :class:`~chemprop.features.featurization.BatchMolGraph` representing
a batch of molecular graphs.
:param atom_descriptors_batch: A list of numpy arrays containing additional atomic descriptors
:return: A PyTorch tensor of shape :code:`(num_molecules, hidden_size)` containing the encoding of each molecule.
Collaborator

Please replace wording such as "PyTorch" with Paddle, and check the whole PR for it.

Comment on lines 122 to 124
# print(self.device)
# f_atoms, f_bonds, a2b, b2a, b2revb, a_scope, b_scope = (mol_graph.
# get_components(atom_messages=self.atom_messages))
Collaborator

Is this debugging code? If it is not needed, it can be removed.

Comment on lines 139 to 140
# print("message", message)
# print("b2revb", b2revb)
Collaborator

Same as above: can this be removed?

Comment on lines 199 to 203
"""
:param args: A :class:`~chemprop.args.TrainArgs` object containing model arguments.
:param atom_fdim: Atom feature vector dimension.
:param bond_fdim: Bond feature vector dimension.
"""
Collaborator

Please move the __init__ docstring here as well.

:param atom_descriptors_batch: A list of numpy arrays containing additional atom descriptors.
:param atom_features_batch: A list of numpy arrays containing additional atom features.
:param bond_features_batch: A list of numpy arrays containing additional bond features.
:return: A PyTorch tensor of shape :code:`(num_molecules, hidden_size)` containing the encoding of each molecule.
Collaborator

Same issue as above.


# import paddle_aux
import paddle
from rdkit import Chem
Collaborator

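rdkit is a dependency of this example only, not of PaddleScience. Please handle the import as in the following code; otherwise none of the other examples will be able to run.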

try:
    import pgl
except ModuleNotFoundError:
    pass
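A slightly more explicit variant of the same pattern might look like the sketch below. The helper names optional_import and require_rdkit are hypothetical and do not exist in PaddleScience; the idea is only that the module stays importable without rdkit and fails with a clear message at the point where rdkit is actually needed.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        return None


# Evaluated at import time, but never raises when rdkit is absent.
Chem = optional_import("rdkit.Chem")


def require_rdkit() -> ModuleType:
    """Return rdkit.Chem, or raise a descriptive error if it is missing."""
    if Chem is None:
        raise ModuleNotFoundError(
            "rdkit is required for this example but is not installed."
        )
    return Chem
```

Functions that need rdkit would call require_rdkit() first, so other examples that import the same package are unaffected.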

@@ -0,0 +1,163 @@
from typing import Callable
Collaborator

Suggested change
- from typing import Callable
+ from __future__ import annotations
+ from typing import Callable

Comment on lines 35 to 38
from examples.synthemol.features import atom_features_zeros
from examples.synthemol.features import get_bond_fdim
from examples.synthemol.features import get_features_generator
from examples.synthemol.features import map_reac_to_prod
Collaborator

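From a normal code-design perspective, module code should not depend on example code: nothing under ppsci/ should depend on code under examples/. I suggest moving these four functions to ppsci/data/dataset/synthemol_dataset.py, and having the example code import them from synthemol_dataset when it needs them.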


# Copyright 2024 Kyle Swanson

import threading
Collaborator

Suggested change
- import threading
+ from __future__ import annotations
+ import threading

)


class MoleculeDatasetIter(io.IterableDataset):
Collaborator

Please add this class to dataset.md.

module.train()


class MoleculeModel(paddle.nn.Layer):
Collaborator

Please add MoleculeModel to arch.md and ppsci/arch/__init__.py.

self.ADDING_H = False


PARAMS = Featurization_parameters()
Collaborator

Can this be initialized lazily? Otherwise, importing this file from anywhere will raise an error that Chem cannot be found.
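One possible way to defer the construction is sketched below. FeaturizationParameters is a stand-in for the PR's Featurization_parameters (only the ADDING_H attribute is taken from the diff), and get_params and instances_created are hypothetical names added for the illustration.

```python
from functools import lru_cache


class FeaturizationParameters:
    """Stand-in for the PR's Featurization_parameters class."""

    instances_created = 0  # bookkeeping for this sketch only

    def __init__(self) -> None:
        FeaturizationParameters.instances_created += 1
        # The real constructor presumably needs rdkit's Chem; omitted here.
        self.ADDING_H = False


@lru_cache(maxsize=None)
def get_params() -> FeaturizationParameters:
    """Build the parameters on first use and return the same object afterwards."""
    return FeaturizationParameters()
```

Call sites would use get_params() instead of the module-level PARAMS, so merely importing the file no longer runs the constructor.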

@HydrogenSulfate
Collaborator

@huhaohao525 In follow-up PRs, please keep an eye on this: [image]. If error messages appear in it, they still need to be resolved.

@huhaohao525
Contributor Author

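> @huhaohao525 In follow-up PRs, please keep an eye on this: [image]. If error messages appear in it, they still need to be resolved.

Why do I still see a NameError saying that Chem is not defined during CI, even though I added from __future__ import annotations?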

Comment on lines 1028 to 1029
Molecule = Union[str, Chem.Mol]
FeaturesGenerator = Callable[[Molecule], np.ndarray]
Collaborator

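Because this line performs an actual assignment of a type alias, which is evaluated at runtime, the annotations import has no effect here. Putting the class name in quotes solves the problem.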

Suggested change
- Molecule = Union[str, Chem.Mol]
- FeaturesGenerator = Callable[[Molecule], np.ndarray]
+ Molecule = Union[str, "Chem.Mol"]
+ FeaturesGenerator = Callable[[Molecule], np.ndarray]
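The difference can be reproduced without rdkit. The sketch below is an illustration, not code from the PR: it executes two small module sources in which Chem is never imported. The source strings, the helper raises_name_error, and the use of object in place of np.ndarray are assumptions made to keep the example self-contained.

```python
from typing import Callable, Union

# Chem appears only in a function annotation. With the future import,
# annotations are stored as strings and are not evaluated at definition time.
ANNOTATION_ONLY = """
from __future__ import annotations

def featurize(mol: Chem.Mol) -> None:
    pass
"""

# Chem appears in an ordinary assignment, which is evaluated immediately,
# so the future import does not help.
RUNTIME_ALIAS = """
from __future__ import annotations
from typing import Union

Molecule = Union[str, Chem.Mol]
"""


def raises_name_error(source: str) -> bool:
    """Execute source in a fresh namespace and report whether NameError was raised."""
    try:
        exec(source, {})
    except NameError:
        return True
    return False


annotation_fails = raises_name_error(ANNOTATION_ONLY)  # no lookup of Chem happens
alias_fails = raises_name_error(RUNTIME_ALIAS)  # Chem is looked up and missing

# Quoting the class name makes it a forward reference, so Chem is not
# looked up when the alias is created.
Molecule = Union[str, "Chem.Mol"]
FeaturesGenerator = Callable[[Molecule], object]  # object stands in for np.ndarray
```

Only the alias assignment fails, which matches the NameError seen in CI and explains why quoting the name is enough.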

Contributor Author

OK, will do.

@leeleolay
Contributor

Training the model raised an error.
[image]

@leeleolay
Contributor

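The code blocks in the md document are not displayed correctly.
Where is the training code for the SyntheMol model? I only see the code for training chemprop.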

Contributor

leeleolay left a comment

OK to merge.

Collaborator

@leeleolay, please help upload the data-related files.

Contributor Author

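The two files have been packaged and uploaded, the documentation has been updated, and the redundant data files have been removed from the PR.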

3 participants