Parsing SwissLipids into a network for LipiNet

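import importlib

import lipinet.databases
import lipinet.utils

# Reload the submodules so that local edits are picked up without restarting the kernel.
# Reloading only the top-level package would not re-execute its submodules.
importlib.reload(lipinet.databases)
importlib.reload(lipinet.utils)

# Import the functions after reloading so the names point at the fresh definitions
from lipinet.databases import get_prior_knowledge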
from lipinet.utils import split_and_expand_large, create_nodedf_from_edgedf, check_for_split_characters

import pandas as pd

Parsing the manual way

LipiNet offers convenient functions to parse prior knowledge resources straight into networks. To show what happens behind the scenes, this notebook walks through the data and each of the steps. This may also be helpful if you need to customise the networks in a way that LipiNet does not yet support directly.

df_swisslipids = get_prior_knowledge('swisslipids', verbose=True)
df_swisslipids

File found locally at /Users/agjanyunlu/Documents/Metabolomics/lipinet/lipinet/.data/downloaded/swisslipids_lipids.tsv. Loading data...
Before cleaning, number of values in lipid class column with trailing space: Lipid class*
False    779171
True         76
Name: count, dtype: int64
After cleaning, number of values in lipid class column with trailing space: Lipid class*
False    779247
Name: count, dtype: int64

	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	SMILES (pH7.3)	InChI (pH7.3)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
0	SLM:000000002	Class	Ceramide (iso-d17:1(4E))	Cer(iso-d17:1(4E))	N-acyl-15-methylhexadecasphing-4-enine	SLM:000399814	NaN	NaN	CC(C)CCCCCCCCC\C=C\[C@@H](O)[C@H](CO)NC([*])=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	70846	NaN	NaN	MNXM97012	\| 11443131 \| 14685263 \| 18390550 \| 21325339 \|...
1	SLM:000000003	Isomeric subspecies	15-methylhexadecasphing-4-enine	NaN	NaN	SLM:000390097	NaN	NaN	CC(C)CCCCCCCCC\C=C\[C@@H](O)[C@@H]([NH3+])CO	InChI=1S/C17H35NO2/c1-15(2)12-10-8-6-4-3-5-7-9...	...	292.282235	303.300605	284.259503	320.236181	344.280632	70771	NaN	NaN	MNXM57784	19372430
2	SLM:000000006	Isomeric subspecies	15-methylhexadecasphinganine	NaN	NaN	SLM:000390097	NaN	NaN	CC(C)CCCCCCCCCCC[C@@H](O)[C@@H]([NH3+])CO	InChI=1S/C17H37NO2/c1-15(2)12-10-8-6-4-3-5-7-9...	...	294.297885	305.316255	286.275153	322.251831	346.296282	70829	NaN	NaN	MNXM97029	19372430
3	SLM:000000007	Class	Sphingomyelin (iso-d17:1(4E))	SM(iso-d17:1(4E))	N-acyl-15-methylhexadecasphing-4-enine-1-phosp...	SLM:000001000	NaN	NaN	CC(C)CCCCCCCCC\C=C\[C@@H](O)[C@H](COP([O-])(=O...	InChI=none	...	NaN	NaN	NaN	NaN	NaN	70775	NaN	NaN	MNXM97113	14685263 \| 21926990 \| 9603947
4	SLM:000000035	Isomeric subspecies	sphinganine	NaN	NaN	SLM:000390097	NaN	NaN	CCCCCCCCCCCCCCC[C@@H](O)[C@@H]([NH3+])CO	InChI=1S/C18H39NO2/c1-2-3-4-5-6-7-8-9-10-11-12...	...	308.313535	319.331905	300.290803	336.267481	360.311932	57817	LMSP01020001	HMDB00269	MNXM302	10652340 \| 10702247 \| 10751414 \| 10802064 \| 10...
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
779244	SLM:000782324	NaN	apo carotenoid	NaN	NaN	SLM:000508864	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	53183	NaN	NaN	NaN	NaN
779245	SLM:000782325	NaN	terpenoid	NaN	NaN	SLM:000508864	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	26873	NaN	NaN	NaN	NaN
779246	SLM:000782326	NaN	C-45 isoprenoid	NaN	NaN	SLM:000508864	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	87168	NaN	NaN	NaN	NaN
779247	SLM:000782327	NaN	gamma-lactone	NaN	NaN	SLM:000782238	NaN	NaN	O1C(C(C(C1=O)))*	NaN	...	NaN	NaN	NaN	NaN	NaN	37581	NaN	NaN	NaN	NaN
779248	SLM:000782328	NaN	oxidized 2-acylglycerol	NaN	NaN	SLM:000000355	NaN	NaN	OCC(CO)OC(=O)*	NaN	...	NaN	NaN	NaN	NaN	NaN	167117	NaN	NaN	NaN	NaN

779249 rows × 29 columns

If we take a closer look at the data, especially the Lipid class* column, we see that some of the values hold multiple entries. For example, Ceramide phosphoinositol is a Class level entry that itself belongs to both the SLM:000000834 and SLM:000399815 classes.

df_swisslipids[df_swisslipids['Lipid class*'].str.contains('|', regex=False, na=False)]

	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	SMILES (pH7.3)	InChI (pH7.3)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
142	SLM:000000392	Class	Ceramide phosphoinositol	IPC	Inositol-1-phosphoceramide	SLM:000000834 \| SLM:000399815	NaN	NaN	O[C@H]([*])[C@H](COP([O-])(=O)O[C@H]1[C@H](O)[...	InChI=none	...	NaN	NaN	NaN	NaN	NaN	64916	NaN	NaN	NaN	10888667 \| 20727985
234	SLM:000000509	Isomeric subspecies	All-trans-retinyl hexadecanoate	NaN	all-trans-retinyl palmitate	SLM:000000982 \| SLM:000508854	NaN	NaN	CCCCCCCCCCCCCCCC(=O)OCC=C(C)C=CC=C(C)C=CC1=C(C...	InChI=1S/C36H60O2/c1-7-8-9-10-11-12-13-14-15-1...	...	NaN	NaN	NaN	NaN	NaN	17616	NaN	HMDB03648	NaN	10769148 \| 10819989 \| 12230550 \| 15550674 \| 15...
315	SLM:000000612	NaN	tetracosenoyl-CoA	NaN	NaN	SLM:000390051 \| SLM:000782334	NaN	NaN	CC(C)(COP([O-])(=O)OP([O-])(=O)OC[C@H]1O[C@H](...	NaN	...	NaN	NaN	NaN	NaN	NaN	74146	NaN	NaN	NaN	18541923 \| 20110363 \| 20937905
317	SLM:000000614	NaN	hexacosenoyl-CoA	NaN	NaN	SLM:000390051 \| SLM:000782334	NaN	NaN	CC(C)(COP([O-])(=O)OP([O-])(=O)OC[C@H]1O[C@H](...	NaN	...	NaN	NaN	NaN	NaN	NaN	74161	NaN	NaN	NaN	18165233
319	SLM:000000621	NaN	2-hydroxy-tetracosenoyl-CoA	NaN	NaN	SLM:000390051 \| SLM:000782334	NaN	NaN	CC(C)(COP([O-])(=O)OP([O-])(=O)OC[C@H]1O[C@H](...	NaN	...	NaN	NaN	NaN	NaN	NaN	74215	NaN	NaN	NaN	18541923
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
755324	SLM:000758294	Class	Globoside	Globo	Globo-series	SLM:000000834 \| SLM:000399813	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	61360	NaN	NaN	NaN	NaN
755325	SLM:000758295	Class	Isogloboside	Isoglobo	Isoglobo-series	SLM:000000834 \| SLM:000399813	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	78257	NaN	NaN	NaN	NaN
779141	SLM:000782221	NaN	Resolvin E	RvE	NaN	SLM:000501332 \| SLM:000508853	NaN	NaN	NaN	InChI=none	...	NaN	NaN	NaN	NaN	NaN	NaN	LMFA0314	NaN	NaN	NaN
779142	SLM:000782222	NaN	Resolvin D	RvD	NaN	SLM:000501331 \| SLM:000508853	NaN	NaN	NaN	InChI=none	...	NaN	NaN	NaN	NaN	NaN	NaN	LMFA0403	NaN	NaN	NaN
779157	SLM:000782237	NaN	an N-(omega-(9Z,12Z-octadecadienoyloxy)-ultra-...	NaN	NaN	SLM:000000413 \| SLM:000782274	NaN	NaN	[C@H]([C@@H](/C=C/CCCCCCCCCCCCC)O)(NC(=O)*COC(...	NaN	...	NaN	NaN	NaN	NaN	NaN	157662	NaN	NaN	NaN	NaN

119 rows × 29 columns

What about other IDs?

cols_with_split_chars = check_for_split_characters(df_swisslipids, delimiter='|')

Checking split characters (|) in Lipid ID
No rows found

Checking split characters (|) in Level
No rows found

Checking split characters (|) in Name
No rows found

Checking split characters (|) in Abbreviation*
Found 9768 rows with split characters

	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	SMILES (pH7.3)	InChI (pH7.3)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
56	SLM:000000262	Class	1,2-diacyl-sn-glycerol	1,2-sn-DAG \| DAG \| DG	Diacylglycerol	SLM:000000423	NaN	NaN	OC[C@@H](COC([])=O)OC([])=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	17815	NaN	NaN	MNXM59	10336610 \| 10685032 \| 10888667 \| 10931938 \| 11...
114	SLM:000000341	Class	1-acyl-sn-glycerol	MAG \| MG	Monoacylglycerol	SLM:000117130	NaN	NaN	OC[C@H](O)COC([*])=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	64683	NaN	NaN	MNXM2963	10685032 \| 15939762 \| 18037386 \| 8663293 \| 960...
122	SLM:000000355	Class	2-acylglycerol	MAG \| MG	Monoacylglycerol	SLM:000000403	NaN	NaN	OCC(CO)OC([*])=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	17389	NaN	NaN	MNXM335	NaN
146	SLM:000000400	Class	Triacylglycerol	TAG \| TG	NaN	SLM:000117141	NaN	NaN	[]C(=O)OCC(COC([])=O)OC([*])=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	17855	NaN	NaN	MNXM248	12682047 \| 16135509 \| 16150821 \| 21704635 \| 27...
147	SLM:000000401	Class	Diacylglycerol	DAG \| DG	NaN	SLM:000117140	NaN	NaN	[]OCC(CO[])O[*]	InChI=none	...	NaN	NaN	NaN	NaN	NaN	18035	NaN	NaN	MNXM59	12682047 \| 16135509 \| 16150821 \| 27247428 \| 29...
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
505694	SLM:000508489	Molecular subspecies	Phosphatidylglycerol (O-17:1_0:0)	LPG(O-17:1_0:0) \| PG(O-17:1_0:0)	Lysophosphatidylglycerol (O-17:1_0:0)	SLM:000508807	SLM:000508779	SLM:000001333 (sn1 or sn2 or sn3)	OCC(O)COP([O-])(=O)OCC(CO[])O[]	InChI=none	...	489.316311	500.334681	481.293579	517.270257	541.314708	NaN	NaN	NaN	MNXM629334	NaN
505695	SLM:000508490	Molecular subspecies	Phosphatidylglycerol (O-15:1_0:0)	LPG(O-15:1_0:0) \| PG(O-15:1_0:0)	Lysophosphatidylglycerol (O-15:1_0:0)	SLM:000508807	SLM:000508775	SLM:000001331 (sn1 or sn2 or sn3)	OCC(O)COP([O-])(=O)OCC(CO[])O[]	InChI=none	...	461.285011	472.303381	453.262279	489.238957	513.283408	NaN	NaN	NaN	MNXM628940	NaN
505696	SLM:000508491	Molecular subspecies	Phosphatidylglycerol (O-13:1_0:0)	LPG(O-13:1_0:0) \| PG(O-13:1_0:0)	Lysophosphatidylglycerol (O-13:1_0:0)	SLM:000508807	SLM:000508771	SLM:000001329 (sn1 or sn2 or sn3)	OCC(O)COP([O-])(=O)OCC(CO[])O[]	InChI=none	...	433.253711	444.272081	425.230979	461.207657	485.252108	NaN	NaN	NaN	MNXM628548	NaN
595061	SLM:000597889	Isomeric subspecies	7-oxoresolvin D2	7-oxo-RvD2\| 7-keto-RvD2	(16R,17S)-dihydroxy-7-oxo-(4Z,8E,10Z,12E,14E,1...	SLM:000508853 \| SLM:000782222	NaN	NaN	C(C/C=C\CC(/C=C/C=C\C=C\C=C\[C@H]([C@H](C/C=C\...	InChI=1S/C22H30O5/c1-2-3-9-16-20(24)21(25)17-1...	...	381.224780	392.243150	373.202048	409.178725	433.223177	137497	NaN	NaN	NaN	22844113
595062	SLM:000597890	Isomeric subspecies	16-oxoresolvin D2	16-oxo-RvD2\| 16-keto-RvD2	(7S,17S)-dihydroxy-16-oxo-(4Z,8E,10Z,12E,14E,1...	SLM:000508853 \| SLM:000782222	NaN	NaN	C(C/C=C\C[C@@H](\C=C\C=C/C=C/C=C/C([C@H](C/C=C...	InChI=1S/C22H30O5/c1-2-3-9-16-20(24)21(25)17-1...	...	381.224780	392.243150	373.202048	409.178725	433.223177	137498	NaN	NaN	NaN	22844113

9768 rows × 29 columns

Checking split characters (|) in Synonyms*
Found 19853 rows with split characters

	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	SMILES (pH7.3)	InChI (pH7.3)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
11	SLM:000000101	Class	1,2-diacyl-sn-glycero-3-phospho-(1'-sn-glycero...	PA	1,2-diacyl-sn-glycero-3-phospho-(1'-sn-glycero...	SLM:000477285	NaN	NaN	O[C@@H](COP([O-])([O-])=O)COP([O-])(=O)OC[C@@H...	InChI=none	...	NaN	NaN	NaN	NaN	NaN	60110	NaN	NaN	MNXM871	20485265 \| 9880566
17	SLM:000000147	Isomeric subspecies	N-(9Z-octadecenoyl)-ethanolamine	NAE (18:1(9Z))	(9Z-octadecenoyl)-ethanolamide \| N-(9Z-octadec...	SLM:000000378	NaN	NaN	CCCCCCCC\C=C/CCCCCCCC(=O)NCCO	InChI=1S/C20H39NO2/c1-2-3-4-5-6-7-8-9-10-11-12...	...	332.313535	343.331905	324.290803	360.267481	384.311932	71466	NaN	HMDB02088	MNXM107386	14634025 \| 16527816 \| 17015445 \| 17626977 \| 17...
18	SLM:000000149	Isomeric subspecies	N-hexadecanoyl-ethanolamine	NAE (16:0)	hexadecanoyl-ethanolamide \| N-hexadecanoyl eth...	SLM:000000378	NaN	NaN	CCCCCCCCCCCCCCCC(=O)NCCO	InChI=1S/C18H37NO2/c1-2-3-4-5-6-7-8-9-10-11-12...	...	306.297885	317.316255	298.275153	334.251831	358.296282	71464	NaN	HMDB02100	MNXM107548	12824167 \| 14634025 \| 15655246 \| 15760304 \| 16...
19	SLM:000000178	Isomeric subspecies	N-(docosanoyl)-15-methylhexadecasphing-4-enine	Cer(iso-d17:1(4E)/22:0)	Ceramide (iso-d17:1(4E)/22:0) \| N-docosanoyl-1...	SLM:000000002	SLM:000392021	SLM:000000827 (n-acyl)	CCCCCCCCCCCCCCCCCCCCCC(=O)N[C@@H](CO)[C@H](O)\...	InChI=1S/C39H77NO3/c1-4-5-6-7-8-9-10-11-12-13-...	...	614.605801	625.624171	606.583069	642.559747	666.604198	71377	NaN	NaN	MNXM107026	19372430
20	SLM:000000179	Isomeric subspecies	N-(heneicosanoyl)-15-methylhexadecasphing-4-enine	Cer(iso-d17:1(4E)/21:0)	Ceramide (iso-d17:1(4E)/21:0) \| N-henicosanoyl...	SLM:000000002	SLM:000392020	SLM:000001207 (n-acyl)	CCCCCCCCCCCCCCCCCCCCC(=O)N[C@@H](CO)[C@H](O)\C...	InChI=1S/C38H75NO3/c1-4-5-6-7-8-9-10-11-12-13-...	...	600.590151	611.608521	592.567419	628.544097	652.588548	71375	NaN	NaN	MNXM107036	19372430
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
745092	SLM:000747954	Isomeric subspecies	CDP-1,2-di-(13-methyltetradecanoyl)-sn-glycerol	CDP-DAG (iso15:0/iso15:0)	1,2-di-(13-methyltetradecanoyl)-sn-glycero-3-c...	SLM:000000084	NaN	SLM:000000047 (sn1 or sn2)	[H]Nc1ccn([C@@H]2O[C@H](COP([O-])(=O)OP([O-])(...	InChI=1S/C42H77N3O15P2/c1-32(2)23-19-15-11-7-5...	...	932.498448	943.516818	924.475716	960.452394	984.496846	NaN	NaN	HMDB0116214	NaN	NaN
745093	SLM:000747955	Isomeric subspecies	CDP-1-(13-methyltetradecanoyl)-2-(15-methylhex...	CDP-DAG (iso15:0/iso17:0)	1-(13-methyltetradecanoyl)-2-(15-methylhexadec...	SLM:000000084	NaN	SLM:000000047 (sn1) / SLM:000000048 (sn2)	[H]Nc1ccn([C@@H]2O[C@H](COP([O-])(=O)OP([O-])(...	InChI=1S/C44H81N3O15P2/c1-34(2)25-21-17-13-9-6...	...	960.529748	971.548118	952.507016	988.483694	1012.528146	NaN	NaN	HMDB0116216	NaN	NaN
745175	SLM:000748037	Isomeric subspecies	CDP-1-(15-methylhexadecanoyl)-2-(11-methyldode...	CDP-DAG (iso17:0/iso13:0)	1-(15-methylhexadecanoyl)-2-(11-methyldodecano...	SLM:000000084	NaN	SLM:000000048 (sn1) / SLM:000001197 (sn2)	[H]Nc1ccn([C@@H]2O[C@H](COP([O-])(=O)OP([O-])(...	InChI=1S/C42H77N3O15P2/c1-32(2)23-19-15-11-8-6...	...	932.498448	943.516818	924.475716	960.452394	984.496846	NaN	NaN	HMDB0116248	NaN	NaN
745176	SLM:000748038	Isomeric subspecies	CDP-1-(15-methylhexadecanoyl)-2-(13-methyltetr...	CDP-DAG (iso17:0/iso15:0)	1-(15-methylhexadecanoyl)-2-(13-methyltetradec...	SLM:000000084	NaN	SLM:000000047 (sn2) / SLM:000000048 (sn1)	[H]Nc1ccn([C@@H]2O[C@H](COP([O-])(=O)OP([O-])(...	InChI=1S/C44H81N3O15P2/c1-34(2)25-21-17-13-9-6...	...	960.529748	971.548118	952.507016	988.483694	1012.528146	NaN	NaN	HMDB0116250	NaN	NaN
745177	SLM:000748039	Isomeric subspecies	CDP-1,2-di-(15-methylhexadecanoyl)-sn-glycerol	CDP-DAG (iso17:0/iso17:0)	1,2-di-(15-methylhexadecanoyl)-sn-glycero-3-cy...	SLM:000000084	NaN	SLM:000000048 (sn1 or sn2)	[H]Nc1ccn([C@@H]2O[C@H](COP([O-])(=O)OP([O-])(...	InChI=1S/C46H85N3O15P2/c1-36(2)27-23-19-15-11-...	...	988.561049	999.579419	980.538317	1016.514994	1040.559446	NaN	NaN	HMDB0116252	NaN	NaN

19853 rows × 29 columns

Checking split characters (|) in Lipid class*
Found 119 rows with split characters

	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	SMILES (pH7.3)	InChI (pH7.3)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
142	SLM:000000392	Class	Ceramide phosphoinositol	IPC	Inositol-1-phosphoceramide	SLM:000000834 \| SLM:000399815	NaN	NaN	O[C@H]([*])[C@H](COP([O-])(=O)O[C@H]1[C@H](O)[...	InChI=none	...	NaN	NaN	NaN	NaN	NaN	64916	NaN	NaN	NaN	10888667 \| 20727985
234	SLM:000000509	Isomeric subspecies	All-trans-retinyl hexadecanoate	NaN	all-trans-retinyl palmitate	SLM:000000982 \| SLM:000508854	NaN	NaN	CCCCCCCCCCCCCCCC(=O)OCC=C(C)C=CC=C(C)C=CC1=C(C...	InChI=1S/C36H60O2/c1-7-8-9-10-11-12-13-14-15-1...	...	NaN	NaN	NaN	NaN	NaN	17616	NaN	HMDB03648	NaN	10769148 \| 10819989 \| 12230550 \| 15550674 \| 15...
315	SLM:000000612	NaN	tetracosenoyl-CoA	NaN	NaN	SLM:000390051 \| SLM:000782334	NaN	NaN	CC(C)(COP([O-])(=O)OP([O-])(=O)OC[C@H]1O[C@H](...	NaN	...	NaN	NaN	NaN	NaN	NaN	74146	NaN	NaN	NaN	18541923 \| 20110363 \| 20937905
317	SLM:000000614	NaN	hexacosenoyl-CoA	NaN	NaN	SLM:000390051 \| SLM:000782334	NaN	NaN	CC(C)(COP([O-])(=O)OP([O-])(=O)OC[C@H]1O[C@H](...	NaN	...	NaN	NaN	NaN	NaN	NaN	74161	NaN	NaN	NaN	18165233
319	SLM:000000621	NaN	2-hydroxy-tetracosenoyl-CoA	NaN	NaN	SLM:000390051 \| SLM:000782334	NaN	NaN	CC(C)(COP([O-])(=O)OP([O-])(=O)OC[C@H]1O[C@H](...	NaN	...	NaN	NaN	NaN	NaN	NaN	74215	NaN	NaN	NaN	18541923
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
755324	SLM:000758294	Class	Globoside	Globo	Globo-series	SLM:000000834 \| SLM:000399813	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	61360	NaN	NaN	NaN	NaN
755325	SLM:000758295	Class	Isogloboside	Isoglobo	Isoglobo-series	SLM:000000834 \| SLM:000399813	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	78257	NaN	NaN	NaN	NaN
779141	SLM:000782221	NaN	Resolvin E	RvE	NaN	SLM:000501332 \| SLM:000508853	NaN	NaN	NaN	InChI=none	...	NaN	NaN	NaN	NaN	NaN	NaN	LMFA0314	NaN	NaN	NaN
779142	SLM:000782222	NaN	Resolvin D	RvD	NaN	SLM:000501331 \| SLM:000508853	NaN	NaN	NaN	InChI=none	...	NaN	NaN	NaN	NaN	NaN	NaN	LMFA0403	NaN	NaN	NaN
779157	SLM:000782237	NaN	an N-(omega-(9Z,12Z-octadecadienoyloxy)-ultra-...	NaN	NaN	SLM:000000413 \| SLM:000782274	NaN	NaN	[C@H]([C@@H](/C=C/CCCCCCCCCCCCC)O)(NC(=O)*COC(...	NaN	...	NaN	NaN	NaN	NaN	NaN	157662	NaN	NaN	NaN	NaN

119 rows × 29 columns

Checking split characters (|) in Parent
No rows found

Checking split characters (|) in Components*
No rows found

Checking split characters (|) in SMILES (pH7.3)
No rows found

Checking split characters (|) in InChI (pH7.3)
No rows found

Checking split characters (|) in InChI key (pH7.3)
No rows found

Checking split characters (|) in Formula (pH7.3)
No rows found

Checking split characters (|) in Charge (pH7.3)
Not a string column

Checking split characters (|) in Mass (pH7.3)
Not a string column

Checking split characters (|) in Exact Mass (neutral form)
Not a string column

Checking split characters (|) in Exact m/z of [M.]+
Not a string column

Checking split characters (|) in Exact m/z of [M+H]+
Not a string column

Checking split characters (|) in Exact m/z of [M+K]+ 
Not a string column

Checking split characters (|) in Exact m/z of [M+Na]+
Not a string column

Checking split characters (|) in Exact m/z of [M+Li]+
Not a string column

Checking split characters (|) in Exact m/z of [M+NH4]+
Not a string column

Checking split characters (|) in Exact m/z of [M-H]-
Not a string column

Checking split characters (|) in Exact m/z of [M+Cl]-
Not a string column

Checking split characters (|) in Exact m/z of [M+OAc]- 
Not a string column

Checking split characters (|) in CHEBI
Found 3 rows with split characters

	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	SMILES (pH7.3)	InChI (pH7.3)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
465	SLM:000000784	Isomeric subspecies	1,2-di-(9Z-octadecenoyl)-sn-glycero-3-phosphate	PA(18:1(9Z)/18:1(9Z))	Phosphatidate (18:1(9Z)/18:1(9Z))	SLM:000000329	SLM:000082169	SLM:000000418 (sn1 or sn2)	CCCCCCCC\C=C/CCCCCCCC(=O)OC[C@H](COP([O-])([O-...	InChI=1S/C39H73O8P/c1-3-5-7-9-11-13-15-17-19-2...	...	707.519775	718.538147	699.497009	735.473694	759.518188	74546 \| 82922	LMGP10010962	HMDB07865	MNXM51075	11309392 \| 14634025 \| 14665624 \| 15164764 \| 15...
387185	SLM:000389154	NaN	(14Z,17Z,20Z,23Z,26Z)-dotriacontapentaenoate	NaN	Fatty acid 32:5(14Z,17Z,20Z,23Z,26Z)	SLM:000389801	NaN	NaN	CCCCC\C=C/C\C=C/C\C=C/C\C=C/C\C=C/CCCCCCCCCCCC...	InChI=1S/C32H54O2/c1-2-3-4-5-6-7-8-9-10-11-12-...	...	477.427836	488.446207	469.405105	505.381782	529.426234	82731 \| CHEBI:82731	LMFA01030848	NaN	NaN	NaN
595221	SLM:000598072	NaN	all-trans-retinol--[retinol-binding protein]	NaN	NaN	SLM:000000982	NaN	NaN	[][C@H](N-)C(-*)=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	17336 \| 83228	NaN	NaN	NaN	20628054 \| 28758396

3 rows × 29 columns

Checking split characters (|) in LIPID MAPS
No rows found

Checking split characters (|) in HMDB
No rows found

Checking split characters (|) in MetaNetX
No rows found

Checking split characters (|) in PMID
Found 1318 rows with split characters

	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	SMILES (pH7.3)	InChI (pH7.3)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
0	SLM:000000002	Class	Ceramide (iso-d17:1(4E))	Cer(iso-d17:1(4E))	N-acyl-15-methylhexadecasphing-4-enine	SLM:000399814	NaN	NaN	CC(C)CCCCCCCCC\C=C\[C@@H](O)[C@H](CO)NC([*])=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	70846	NaN	NaN	MNXM97012	\| 11443131 \| 14685263 \| 18390550 \| 21325339 \|...
3	SLM:000000007	Class	Sphingomyelin (iso-d17:1(4E))	SM(iso-d17:1(4E))	N-acyl-15-methylhexadecasphing-4-enine-1-phosp...	SLM:000001000	NaN	NaN	CC(C)CCCCCCCCC\C=C\[C@@H](O)[C@H](COP([O-])(=O...	InChI=none	...	NaN	NaN	NaN	NaN	NaN	70775	NaN	NaN	MNXM97113	14685263 \| 21926990 \| 9603947
4	SLM:000000035	Isomeric subspecies	sphinganine	NaN	NaN	SLM:000390097	NaN	NaN	CCCCCCCCCCCCCCC[C@@H](O)[C@@H]([NH3+])CO	InChI=1S/C18H39NO2/c1-2-3-4-5-6-7-8-9-10-11-12...	...	308.313535	319.331905	300.290803	336.267481	360.311932	57817	LMSP01020001	HMDB00269	MNXM302	10652340 \| 10702247 \| 10751414 \| 10802064 \| 10...
5	SLM:000000042	Isomeric subspecies	cholesta-5,7-dien-3beta-ol	NaN	NaN	SLM:000501263	NaN	NaN	[H][C@@]1(CC[C@@]2([H])C3=CC=C4C[C@@H](O)CC[C@...	InChI=1S/C27H44O/c1-18(2)7-6-8-19(3)23-11-12-2...	...	391.354671	402.373042	383.331940	419.308617	443.353069	17759	LMST01010069	HMDB00032	MNXM710	10329655 \| 10344195 \| 10786622 \| 11230174 \| 16...
6	SLM:000000043	Isomeric subspecies	lathosterone	NaN	NaN	SLM:000501263	NaN	NaN	[H][C@@]12CC=C3[C@]4([H])CC[C@]([H])([C@H](C)C...	InChI=1S/C27H44O/c1-18(2)7-6-8-19(3)23-11-12-2...	...	391.354671	402.373042	383.331940	419.308617	443.353069	71550	NaN	NaN	MNXM97065	19531354 \| 22505847
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
595221	SLM:000598072	NaN	all-trans-retinol--[retinol-binding protein]	NaN	NaN	SLM:000000982	NaN	NaN	[][C@H](N-)C(-*)=O	InChI=none	...	NaN	NaN	NaN	NaN	NaN	17336 \| 83228	NaN	NaN	NaN	20628054 \| 28758396
595222	SLM:000598073	NaN	all-trans-retinyl heptanoate	NaN	NaN	SLM:000000982	NaN	NaN	C1(C)(C)C(\C=C\C(=C\C=C\C(=C\COC(CCCCCC)=O)\C)...	InChI=1S/C27H42O2/c1-7-8-9-10-16-26(28)29-21-1...	...	NaN	NaN	NaN	NaN	NaN	138724	NaN	NaN	NaN	20628054 \| 28758396
595223	SLM:000598074	NaN	2-heptanoyl-sn-glycero-3-phosphocholine	NaN	NaN	SLM:000000724	NaN	NaN	P(OC[C@@H](CO)OC(=O)CCCCCC)(=O)(OCC[N+](C)(C)C...	InChI=1S/C15H32NO7P/c1-5-6-7-8-9-15(18)23-14(1...	...	NaN	NaN	NaN	NaN	NaN	138266	NaN	NaN	NaN	20628054 \| 22605381 \| 28758396
595230	SLM:000598083	NaN	12-hydroxy-(9Z)-octadecenoyl-CoA	NaN	NaN	SLM:000389958 \| SLM:000390051	NaN	NaN	S(C(CCCCCCC/C=C\C[C@@H](CCCCCC)O)=O)CCNC(CCNC(...	InChI=1S/C39H68N7O18P3S/c1-4-5-6-13-16-27(47)1...	...	NaN	NaN	NaN	NaN	NaN	139559	NaN	NaN	NaN	17084870 \| 27758859
595245	SLM:000598101	NaN	a mannosylinositol-1-phospho-N-(2-hydroxyacyl)...	NaN	NaN	SLM:000000835	NaN	NaN	OC[C@H]1OC(O[C@@H]2[C@@H](O)[C@H](O)[C@@H](O)[...	InChI=none	...	NaN	NaN	NaN	NaN	NaN	74994	NaN	NaN	NaN	12954640 \| 9368028

1318 rows × 29 columns

Okay wow! So these are all the columns we have found with split characters…

cols_with_split_chars

['Abbreviation*', 'Synonyms*', 'Lipid class*', 'CHEBI', 'PMID']
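To make the check above less of a black box, here is a minimal sketch of how such a scan could be written in plain pandas. The function name `find_columns_with_delimiter` and the toy data are hypothetical, and LipiNet's own `check_for_split_characters` may well be implemented differently (for instance, it also prints the matching rows).

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def find_columns_with_delimiter(df, delimiter="|"):
    """Return the names of non-numeric columns in which any value contains `delimiter`."""
    hits = []
    for col in df.columns:
        # Numeric columns cannot contain a delimiter character, so skip them
        if is_numeric_dtype(df[col]):
            continue
        # regex=False matches '|' literally; na=False treats missing values as "no match"
        if df[col].str.contains(delimiter, regex=False, na=False).any():
            hits.append(col)
    return hits


# Toy data (hypothetical IDs), not the real SwissLipids table
toy = pd.DataFrame({
    "Lipid ID": ["SLM:1", "SLM:2"],
    "Lipid class*": ["SLM:10 | SLM:11", "SLM:12"],
    "Mass": [1.0, 2.0],
})
cols_found = find_columns_with_delimiter(toy)
print(cols_found)
```

Passing `regex=False` matters here because `|` is the alternation operator in regular expressions and would otherwise match every string.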

We can also check for other delimiter characters if we know they will be present. For instance, SL uses the / character in Components*, but the same character also appears in a number of other columns, such as the lipid names themselves or the SMILES and InChI strings, so we drop those columns before checking.

check_for_split_characters(df_swisslipids.drop(columns=['Name','Abbreviation*','Synonyms*','SMILES (pH7.3)','InChI (pH7.3)']), delimiter='/')

Checking split characters (/) in Lipid ID
No rows found

Checking split characters (/) in Level
No rows found

Checking split characters (/) in Lipid class*
No rows found

Checking split characters (/) in Parent
No rows found

Checking split characters (/) in Components*
Found 708725 rows with split characters

	Lipid ID	Level	Lipid class*	Parent	Components*	InChI key (pH7.3)	Formula (pH7.3)	Charge (pH7.3)	Mass (pH7.3)	Exact Mass (neutral form)	...	Exact m/z of [M+Li]+	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID
164	SLM:000000422	Isomeric subspecies	SLM:000000329	SLM:000081844	SLM:000000418 (sn2) / SLM:000000510 (sn1)	InChIKey=OPVZUEPSMJNLOM-QEJMHMKOSA-L	C37H69O8P	-2.0	672.913818	674.488647	...	681.504089	692.522461	673.481384	709.458069	733.502502	64839	LMGP10010032	HMDB07859	MNXM66476	10359651 \| 11788596 \| 12963729 \| 16620771 \| 17...
229	SLM:000000498	Isomeric subspecies	SLM:000000324	SLM:000105249	SLM:000000296 (sn2) / SLM:000000826 (sn1)	InChIKey=KRTOMQDUKGRFDJ-ZAHDIIMDSA-M	C47H82O13P	-1.0	886.120483	886.557129	...	893.572571	904.590942	885.549866	921.526550	945.570984	133606	LMGP06010010	HMDB09815	MNXM75683	22942276 \| 23097495 \| 23472195 \| 8300559
269	SLM:000000557	Isomeric subspecies	SLM:000000261	SLM:000088147	SLM:000000510 (sn1) / SLM:000000826 (sn2)	InChIKey=PZNPLUBHRSSFHT-RRHRGVEJSA-N	C42H84NO8P	0.0	762.091980	761.593445	...	768.608887	779.627258	NaN	796.562866	820.607300	73000	LMGP01010573	HMDB07970	MNXM69304	18195019 \| 19416660 \| 22923616 \| 27399000
332	SLM:000000636	Isomeric subspecies	SLM:000000329	SLM:000082164	SLM:000000418 (sn1) / SLM:000000510 (sn2)	InChIKey=ZSXHMDPHNCOWSV-QEJMHMKOSA-L	C37H69O8P	-2.0	672.913818	674.488647	...	681.504089	692.522461	673.481384	709.458069	733.502502	74551	LMGP10010964	NaN	MNXM66662	16620771 \| 18606822 \| 19318427 \| 19801371 \| 20...
333	SLM:000000637	Isomeric subspecies	SLM:000000329	SLM:000082168	SLM:000000418 (sn1) / SLM:000000826 (sn2)	InChIKey=XIERONXOJKEALF-PXYGFXEISA-L	C39H73O8P	-2.0	700.966980	702.519958	...	709.535400	720.553772	701.512695	737.489380	761.533813	74552	LMGP10010963	NaN	MNXM66667	16620771 \| 18606822 \| 19318427 \| 19801371 \| 21...
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
745172	SLM:000748034	Isomeric subspecies	SLM:000000084	NaN	SLM:000000048 (sn1) / SLM:000001195 (sn2)	InChIKey=LJSBNBPNSBKZCI-JNOBRDIFSA-L	C33H57N3O15P2	-2.0	NaN	799.342142	...	806.357598	817.375968	798.334866	834.311543	858.355995	NaN	NaN	NaN	NaN	NaN
745173	SLM:000748035	Isomeric subspecies	SLM:000000084	NaN	SLM:000000048 (sn1) / SLM:000001196 (sn2)	InChIKey=ODNYDZLXLRZPCJ-GPTQCAHZSA-L	C35H61N3O15P2	-2.0	NaN	827.373442	...	834.388898	845.407268	826.366166	862.342844	886.387295	NaN	NaN	NaN	NaN	NaN
745174	SLM:000748036	Isomeric subspecies	SLM:000000084	NaN	SLM:000000048 (sn1) / SLM:000000853 (sn2)	InChIKey=FJIBTCUXUBRYKG-QOTCTSOZSA-L	C37H65N3O15P2	-2.0	NaN	855.404743	...	862.420198	873.438568	854.397466	890.374144	914.418595	NaN	NaN	NaN	NaN	NaN
745175	SLM:000748037	Isomeric subspecies	SLM:000000084	NaN	SLM:000000048 (sn1) / SLM:000001197 (sn2)	InChIKey=AIBKQADSQWEVSS-HUKRWTLJSA-L	C42H75N3O15P2	-2.0	NaN	925.482993	...	932.498448	943.516818	924.475716	960.452394	984.496846	NaN	NaN	HMDB0116248	NaN	NaN
745176	SLM:000748038	Isomeric subspecies	SLM:000000084	NaN	SLM:000000047 (sn2) / SLM:000000048 (sn1)	InChIKey=PIZFKSVTEGNINS-BQUKFSKHSA-L	C44H79N3O15P2	-2.0	NaN	953.514293	...	960.529748	971.548118	952.507016	988.483694	1012.528146	NaN	NaN	HMDB0116250	NaN	NaN

708725 rows × 24 columns

Checking split characters (/) in InChI key (pH7.3)
No rows found

Checking split characters (/) in Formula (pH7.3)
No rows found

Checking split characters (/) in Charge (pH7.3)
Not a string column

Checking split characters (/) in Mass (pH7.3)
Not a string column

Checking split characters (/) in Exact Mass (neutral form)
Not a string column

Checking split characters (/) in Exact m/z of [M.]+
Not a string column

Checking split characters (/) in Exact m/z of [M+H]+
Not a string column

Checking split characters (/) in Exact m/z of [M+K]+ 
Not a string column

Checking split characters (/) in Exact m/z of [M+Na]+
Not a string column

Checking split characters (/) in Exact m/z of [M+Li]+
Not a string column

Checking split characters (/) in Exact m/z of [M+NH4]+
Not a string column

Checking split characters (/) in Exact m/z of [M-H]-
Not a string column

Checking split characters (/) in Exact m/z of [M+Cl]-
Not a string column

Checking split characters (/) in Exact m/z of [M+OAc]- 
Not a string column

Checking split characters (/) in CHEBI
No rows found

Checking split characters (/) in LIPID MAPS
No rows found

Checking split characters (/) in HMDB
No rows found

Checking split characters (/) in MetaNetX
No rows found

Checking split characters (/) in PMID
No rows found

['Components*']

These double entries for the classes are important to take into account for our class hierarchy, because if we don't, many of these Class level entries will become disconnected in the ontology.

To help us handle this connection we will split it into two using the split_and_expand_large utility function, but we will come back to this a bit later…
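As an illustration of what splitting a multi-valued column means, the sketch below uses only pandas (`str.split` followed by `explode`) on two rows copied from the table above. It is a stand-in under that assumption, not the implementation of `split_and_expand_large`, which may work differently.

```python
import pandas as pd

# Two rows taken from the SwissLipids table shown earlier
toy = pd.DataFrame({
    "Lipid ID": ["SLM:000000392", "SLM:000000002"],
    "Lipid class*": ["SLM:000000834 | SLM:000399815", "SLM:000399814"],
})

# Turn each cell into a list of classes, then give every list element its own row
expanded = (
    toy.assign(**{"Lipid class*": toy["Lipid class*"].str.split("|")})
       .explode("Lipid class*")
       .reset_index(drop=True)
)
# Remove the spaces that surrounded the delimiter
expanded["Lipid class*"] = expanded["Lipid class*"].str.strip()
print(expanded)
```

After the split, SLM:000000392 appears on two rows, one per parent class, so each row can later become its own edge.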

For now, we also add a second column for the components, so that later we can keep both the original component with its position (e.g. sn1 or sn2) and a parsed version that holds just the SwissLipids (SL) ID.

df_swisslipids['Components_parsed'] = df_swisslipids['Components*']
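At this point the new column is only a copy. One possible way to derive the parsed version later is to pull out every SwissLipids ID with a regular expression. The pattern `SLM:\d+` is an assumption based on the IDs visible in the tables above, and this is a sketch rather than the parsing LipiNet performs.

```python
import pandas as pd

# Example values copied from the Components* column, plus a missing value
components = pd.Series([
    "SLM:000000418 (sn2) / SLM:000000510 (sn1)",
    "SLM:000000827 (n-acyl)",
    None,
])

# Keep only the IDs and discard the position annotations such as (sn1) or (n-acyl)
parsed = components.str.findall(r"SLM:\d+")
print(parsed)
```

Each non-missing value becomes a list of IDs, which could then be expanded into one row per component in the same way as a multi-valued class column.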

Now we can melt to start creating the edges df

Building the edges df

# # Split the 'Lipid class*' column into multiple rows
# df_swisslipids_splitexp = split_and_expand_large(
#     df_swisslipids, #.assign(from_layer_col='swisslipids')
#     split_col='Lipid class*', 
#     expand_cols=['Lipid ID', 'Level', 'Name', 'Abbreviation*',
#                     'CHEBI', 'LIPID MAPS', 'HMDB', 'MetaNetX', 'PMID','Synonyms*','Parent','Components*','Components_parsed'], #'from_layer_col'
#     delimiter='|'
# )

df_swisslipids_edges = pd.melt(df_swisslipids,  #df_swisslipids_splitexp
                id_vars=['Lipid ID'], 
                value_vars=['CHEBI','LIPID MAPS','HMDB','MetaNetX','PMID','Lipid class*','Abbreviation*','Synonyms*','Parent','Components*','Components_parsed'], 
                var_name='melted_column', value_name='value')
df_swisslipids_edges

	Lipid ID	melted_column	value
0	SLM:000000002	CHEBI	70846
1	SLM:000000003	CHEBI	70771
2	SLM:000000006	CHEBI	70829
3	SLM:000000007	CHEBI	70775
4	SLM:000000035	CHEBI	57817
...	...	...	...
8571734	SLM:000782324	Components_parsed	NaN
8571735	SLM:000782325	Components_parsed	NaN
8571736	SLM:000782326	Components_parsed	NaN
8571737	SLM:000782327	Components_parsed	NaN
8571738	SLM:000782328	Components_parsed	NaN

8571739 rows × 3 columns

The melt operation also produced a large number of null values, which most likely carry no information for us here, so we drop every row where the value is null.

df_swisslipids_edges = df_swisslipids_edges.dropna(subset='value')
df_swisslipids_edges

	Lipid ID	melted_column	value
0	SLM:000000002	CHEBI	70846
1	SLM:000000003	CHEBI	70771
2	SLM:000000006	CHEBI	70829
3	SLM:000000007	CHEBI	70775
4	SLM:000000035	CHEBI	57817
...	...	...	...
8571494	SLM:000781997	Components_parsed	SLM:000000856 (n-acyl)
8571495	SLM:000781998	Components_parsed	SLM:000389154 (n-acyl)
8571496	SLM:000781999	Components_parsed	SLM:000485643 (n-acyl)
8571497	SLM:000782000	Components_parsed	SLM:000485644 (n-acyl)
8571498	SLM:000782001	Components_parsed	SLM:000485645 (n-acyl)

4678499 rows × 3 columns
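The melt and drop steps are easier to follow on a small table. The sketch below uses made-up toy rows with two cross-reference columns; only the column names mirror the real data.

```python
import pandas as pd

# Toy wide table: one row per lipid, one column per cross-reference
toy = pd.DataFrame({
    "Lipid ID": ["SLM:1", "SLM:2"],
    "CHEBI": ["70846", None],
    "HMDB": [None, "HMDB00269"],
})

# Wide to long: one row per (lipid, column) pair, including the empty cells
long_form = pd.melt(
    toy,
    id_vars=["Lipid ID"],
    value_vars=["CHEBI", "HMDB"],
    var_name="melted_column",
    value_name="value",
)

# Empty cells do not describe an edge, so they are removed
edges = long_form.dropna(subset=["value"])
print(edges)
```

Melting always yields (number of rows) × (number of value columns) rows, which is why the real table grows from 779249 to 8571739 rows (779249 × 11) before the nulls are dropped.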

There are still some things we need to tidy up so that the edges are in a suitable format for OnionNet.

# Work on a copy so we do not modify a view of the melted frame
df_swisslipids_edges = df_swisslipids_edges.copy()
# Every edge starts at a SwissLipids entry
df_swisslipids_edges['source_layer'] = 'swisslipids'
df_swisslipids_edges.rename(columns={'Lipid ID':'source_id', 'melted_column':'target_layer', 'value':'target_id'}, inplace=True)
df_swisslipids_edges = df_swisslipids_edges[['source_layer','source_id','target_layer','target_id']]
# 'Lipid class*' points to another SwissLipids entry; all other columns become their own 'sl_<name>' layer
df_swisslipids_edges['target_layer'] = df_swisslipids_edges['target_layer'].map(lambda x: 'swisslipids' if x=='Lipid class*' else f"sl_{str(x).replace(' ','').strip('*').lower()}")
df_swisslipids_edges

	source_layer	source_id	target_layer	target_id
0	swisslipids	SLM:000000002	sl_chebi	70846
1	swisslipids	SLM:000000003	sl_chebi	70771
2	swisslipids	SLM:000000006	sl_chebi	70829
3	swisslipids	SLM:000000007	sl_chebi	70775
4	swisslipids	SLM:000000035	sl_chebi	57817
...	...	...	...	...
8571494	swisslipids	SLM:000781997	sl_components_parsed	SLM:000000856 (n-acyl)
8571495	swisslipids	SLM:000781998	sl_components_parsed	SLM:000389154 (n-acyl)
8571496	swisslipids	SLM:000781999	sl_components_parsed	SLM:000485643 (n-acyl)
8571497	swisslipids	SLM:000782000	sl_components_parsed	SLM:000485644 (n-acyl)
8571498	swisslipids	SLM:000782001	sl_components_parsed	SLM:000485645 (n-acyl)

4678499 rows × 4 columns
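The layer-naming rule in the lambda above can be easier to follow as a named function. The sketch below mirrors that lambda; the name to_layer_name is hypothetical and is not part of LipiNet.

```python
def to_layer_name(column_name):
    """Map a SwissLipids column name to a layer label (mirrors the lambda above)."""
    if column_name == "Lipid class*":
        # The lipid class is itself a SwissLipids entry, so it stays in the same layer.
        return "swisslipids"
    # Remove spaces, strip the '*' marker, lowercase, and add the 'sl_' prefix.
    cleaned = str(column_name).replace(" ", "").strip("*").lower()
    return f"sl_{cleaned}"


examples = ["Lipid class*", "Abbreviation*", "LIPID MAPS", "Components_parsed"]
layer_names = {name: to_layer_name(name) for name in examples}
print(layer_names)
```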

For swisslipids-to-swisslipids rows, the direction needs reversing. At this point source_id is the child lipid and target_id is its parent class (taken from the Lipid class* column). We would rather have parents point to their children, so that the root of the hierarchy is a node with outgoing edges and no incoming edges. Because source_layer and target_layer are both swisslipids for these rows, the swap below effectively only exchanges source_id and target_id.

Run this cell only once: running it a second time swaps the columns back.

# Identify rows where both source_layer and target_layer are 'swisslipids'
condition = (df_swisslipids_edges["source_layer"] == "swisslipids") & (df_swisslipids_edges["target_layer"] == "swisslipids")

# Swap the columns for rows satisfying the condition
df_swisslipids_edges.loc[condition, ["source_layer", "source_id", "target_layer", "target_id"]] = df_swisslipids_edges.loc[condition, ["target_layer", "target_id", "source_layer", "source_id"]].values

# Output the modified DataFrame
df_swisslipids_edges

	source_layer	source_id	target_layer	target_id
0	swisslipids	SLM:000000002	sl_chebi	70846
1	swisslipids	SLM:000000003	sl_chebi	70771
2	swisslipids	SLM:000000006	sl_chebi	70829
3	swisslipids	SLM:000000007	sl_chebi	70775
4	swisslipids	SLM:000000035	sl_chebi	57817
...	...	...	...	...
8571494	swisslipids	SLM:000781997	sl_components_parsed	SLM:000000856 (n-acyl)
8571495	swisslipids	SLM:000781998	sl_components_parsed	SLM:000389154 (n-acyl)
8571496	swisslipids	SLM:000781999	sl_components_parsed	SLM:000485643 (n-acyl)
8571497	swisslipids	SLM:000782000	sl_components_parsed	SLM:000485644 (n-acyl)
8571498	swisslipids	SLM:000782001	sl_components_parsed	SLM:000485645 (n-acyl)

4678499 rows × 4 columns
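If re-running the cell by accident is a concern, one option is to record which rows have already been flipped. The following is only a sketch on toy data: the function name and the parent_to_child flag column are hypothetical and are not something LipiNet defines.

```python
import pandas as pd


def swap_intralayer_direction(edges, layer="swisslipids", flag_col="parent_to_child"):
    """Swap source and target ids of intralayer edges, at most once per row."""
    edges = edges.copy()
    if flag_col not in edges.columns:
        edges[flag_col] = False
    todo = (
        (edges["source_layer"] == layer)
        & (edges["target_layer"] == layer)
        & ~edges[flag_col]
    )
    if todo.any():
        # Both layers are identical for these rows, so only the ids need exchanging.
        edges.loc[todo, ["source_id", "target_id"]] = edges.loc[todo, ["target_id", "source_id"]].values
        edges.loc[todo, flag_col] = True
    return edges


toy_edges = pd.DataFrame({
    "source_layer": ["swisslipids", "swisslipids"],
    "source_id": ["SLM:child", "SLM:child"],
    "target_layer": ["swisslipids", "sl_chebi"],
    "target_id": ["SLM:parent", "70846"],
})

once = swap_intralayer_direction(toy_edges)
twice = swap_intralayer_direction(once)  # second call changes nothing
```

The extra flag column changes the schema, so it would have to be dropped again before comparing against the output of the packaged parser.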

df_swisslipids_edges['target_layer'].value_counts()

target_layer
swisslipids             779247
sl_abbreviation         776464
sl_components           765323
sl_components_parsed    765323
sl_synonyms             548163
sl_metanetx             505003
sl_parent               493491
sl_hmdb                  26026
sl_lipidmaps             12117
sl_chebi                  4276
sl_pmid                   3066
Name: count, dtype: int64

Now let’s return to two items on our todo list:

splitting values that hold multiple identifiers
trimming/parsing the components columns

edges_with_multilinks = df_swisslipids_edges[df_swisslipids_edges['target_id'].str.contains('|', regex=False, na=False)]
edges_with_multilinks

	source_layer	source_id	target_layer	target_id
465	swisslipids	SLM:000000784	sl_chebi	74546 \| 82922
387185	swisslipids	SLM:000389154	sl_chebi	82731 \| CHEBI:82731
595221	swisslipids	SLM:000598072	sl_chebi	17336 \| 83228
3116996	swisslipids	SLM:000000002	sl_pmid	\| 11443131 \| 14685263 \| 18390550 \| 21325339 \|...
3116999	swisslipids	SLM:000000007	sl_pmid	14685263 \| 21926990 \| 9603947
...	...	...	...	...
6199835	swisslipids	SLM:000747954	sl_synonyms	1,2-di-(13-methyltetradecanoyl)-sn-glycero-3-c...
6199836	swisslipids	SLM:000747955	sl_synonyms	1-(13-methyltetradecanoyl)-2-(15-methylhexadec...
6199918	swisslipids	SLM:000748037	sl_synonyms	1-(15-methylhexadecanoyl)-2-(11-methyldodecano...
6199919	swisslipids	SLM:000748038	sl_synonyms	1-(15-methylhexadecanoyl)-2-(13-methyltetradec...
6199920	swisslipids	SLM:000748039	sl_synonyms	1,2-di-(15-methylhexadecanoyl)-sn-glycero-3-cy...

30942 rows × 4 columns

edges_with_multilinks.value_counts('target_layer')

target_layer
sl_synonyms        19853
sl_abbreviation     9768
sl_pmid             1318
sl_chebi               3
Name: count, dtype: int64

edges_with_multilinks_split = split_and_expand_large(edges_with_multilinks, 
                       split_col='target_id', 
                       expand_cols=['source_layer','source_id','target_layer'],
                       delimiter='|').drop_duplicates()
edges_with_multilinks_split

	source_layer	source_id	target_layer	target_id
0	swisslipids	SLM:000000784	sl_chebi	74546
1	swisslipids	SLM:000000784	sl_chebi	82922
2	swisslipids	SLM:000389154	sl_chebi	82731
3	swisslipids	SLM:000389154	sl_chebi	CHEBI:82731
4	swisslipids	SLM:000598072	sl_chebi	17336
...	...	...	...	...
68383	swisslipids	SLM:000748037	sl_synonyms	CDP-DG(22:6(4Z,7Z,10Z,13Z,16Z,19Z)/18:1(11Z))
68384	swisslipids	SLM:000748038	sl_synonyms	1-(15-methylhexadecanoyl)-2-(13-methyltetradec...
68385	swisslipids	SLM:000748038	sl_synonyms	CDP-DG(22:6(4Z,7Z,10Z,13Z,16Z,19Z)/18:1(9Z))
68386	swisslipids	SLM:000748039	sl_synonyms	1,2-di-(15-methylhexadecanoyl)-sn-glycero-3-cy...
68387	swisslipids	SLM:000748039	sl_synonyms	CDP-DG(22:6(4Z,7Z,10Z,13Z,16Z,19Z)/18:2(9Z,12Z))

68380 rows × 4 columns
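For readers without access to the helper, the same kind of split can be sketched in plain pandas with str.split and explode. This is a stand-in written for illustration, not the implementation of split_and_expand_large, and the rows below are toy values shaped like the ones above.

```python
import pandas as pd

toy_edges = pd.DataFrame({
    "source_layer": ["swisslipids", "swisslipids"],
    "source_id": ["SLM:000000784", "SLM:000000002"],
    "target_layer": ["sl_chebi", "sl_pmid"],
    "target_id": ["74546 | 82922", "| 11443131 | 14685263 |"],
})


def split_multivalue(df, split_col, delimiter):
    """Give every delimiter-separated value in split_col its own row."""
    out = df.copy()
    out[split_col] = out[split_col].str.split(delimiter)
    out = out.explode(split_col, ignore_index=True)
    out[split_col] = out[split_col].str.strip()
    # Leading or trailing delimiters leave empty strings behind; drop those.
    out = out[out[split_col] != ""]
    return out.drop_duplicates().reset_index(drop=True)


split_edges = split_multivalue(toy_edges, split_col="target_id", delimiter="|")
print(split_edges)
```

The other columns are repeated for every piece, so each new row is still a complete edge.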

This is good, but we also need to handle the components columns, where multiple components are separated by /.

edges_with_multilinks2 = df_swisslipids_edges[df_swisslipids_edges['target_id'].str.contains('/', regex=False, na=False) &
                     df_swisslipids_edges['target_layer'].str.contains('sl_components', regex=False, na=False)]
edges_with_multilinks2

	source_layer	source_id	target_layer	target_id
7013405	swisslipids	SLM:000000422	sl_components	SLM:000000418 (sn2) / SLM:000000510 (sn1)
7013470	swisslipids	SLM:000000498	sl_components	SLM:000000296 (sn2) / SLM:000000826 (sn1)
7013510	swisslipids	SLM:000000557	sl_components	SLM:000000510 (sn1) / SLM:000000826 (sn2)
7013573	swisslipids	SLM:000000636	sl_components	SLM:000000418 (sn1) / SLM:000000510 (sn2)
7013574	swisslipids	SLM:000000637	sl_components	SLM:000000418 (sn1) / SLM:000000826 (sn2)
...	...	...	...	...
8537662	swisslipids	SLM:000748034	sl_components_parsed	SLM:000000048 (sn1) / SLM:000001195 (sn2)
8537663	swisslipids	SLM:000748035	sl_components_parsed	SLM:000000048 (sn1) / SLM:000001196 (sn2)
8537664	swisslipids	SLM:000748036	sl_components_parsed	SLM:000000048 (sn1) / SLM:000000853 (sn2)
8537665	swisslipids	SLM:000748037	sl_components_parsed	SLM:000000048 (sn1) / SLM:000001197 (sn2)
8537666	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000047 (sn2) / SLM:000000048 (sn1)

1417450 rows × 4 columns

edges_with_multilinks2_split = split_and_expand_large(edges_with_multilinks2, 
                       split_col='target_id', 
                       expand_cols=['source_layer','source_id','target_layer'],
                       delimiter='/').drop_duplicates()
edges_with_multilinks2_split

	source_layer	source_id	target_layer	target_id
0	swisslipids	SLM:000000422	sl_components	SLM:000000418 (sn2)
1	swisslipids	SLM:000000422	sl_components	SLM:000000510 (sn1)
2	swisslipids	SLM:000000498	sl_components	SLM:000000296 (sn2)
3	swisslipids	SLM:000000498	sl_components	SLM:000000826 (sn1)
4	swisslipids	SLM:000000557	sl_components	SLM:000000510 (sn1)
...	...	...	...	...
3592487	swisslipids	SLM:000748036	sl_components_parsed	SLM:000000853 (sn2)
3592488	swisslipids	SLM:000748037	sl_components_parsed	SLM:000000048 (sn1)
3592489	swisslipids	SLM:000748037	sl_components_parsed	SLM:000001197 (sn2)
3592490	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000047 (sn2)
3592491	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000048 (sn1)

3592492 rows × 4 columns

Now let’s also strip the bracketed annotation (for example (sn1) or (sn2)) from the parsed components, so that the remaining SLM IDs can be linked directly to the other SwissLipids entries if needed.

# Apply transformation only for rows where target_layer equals 'sl_components_parsed'
mask = edges_with_multilinks2_split['target_layer'] == 'sl_components_parsed'
edges_with_multilinks2_split.loc[mask, 'target_id'] = edges_with_multilinks2_split.loc[mask, 'target_id'].str.split('(').str[0].str.strip()
edges_with_multilinks2_split

	source_layer	source_id	target_layer	target_id
0	swisslipids	SLM:000000422	sl_components	SLM:000000418 (sn2)
1	swisslipids	SLM:000000422	sl_components	SLM:000000510 (sn1)
2	swisslipids	SLM:000000498	sl_components	SLM:000000296 (sn2)
3	swisslipids	SLM:000000498	sl_components	SLM:000000826 (sn1)
4	swisslipids	SLM:000000557	sl_components	SLM:000000510 (sn1)
...	...	...	...	...
3592487	swisslipids	SLM:000748036	sl_components_parsed	SLM:000000853
3592488	swisslipids	SLM:000748037	sl_components_parsed	SLM:000000048
3592489	swisslipids	SLM:000748037	sl_components_parsed	SLM:000001197
3592490	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000047
3592491	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000048

3592492 rows × 4 columns
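If the bracketed position is worth keeping, for example as an edge attribute, a regular expression can separate it from the ID instead of discarding it. This is an optional variation on the step above rather than what the notebook does, and the column name position is an assumption.

```python
import pandas as pd

components = pd.Series(["SLM:000000418 (sn2)", "SLM:000000856 (n-acyl)", "SLM:000000048"])

# target_id: everything up to the first space or bracket.
# position: the text inside an optional trailing "(...)".
pattern = r"^\s*(?P<target_id>[^\s(]+)\s*(?:\((?P<position>[^)]*)\))?\s*$"
parsed = components.str.extract(pattern)
print(parsed)
```

Entries without a bracket get a missing value in position, so the ID column stays usable either way.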

Now we need to remove the original multi-value rows from the edges df and add the corrected, split rows in their place.

# Identify rows with multilinks (either '|' or '/' with the specific target_layer condition)
mask_pipe = df_swisslipids_edges['target_id'].str.contains('|', regex=False, na=False)
mask_slash = (
    df_swisslipids_edges['target_id'].str.contains('/', regex=False, na=False) &
    df_swisslipids_edges['target_layer'].str.contains('sl_components', regex=False, na=False)
)
mask_problem = mask_pipe | mask_slash

# Remove these rows from the original df
df_clean = df_swisslipids_edges[~mask_problem].copy()

# Now, combine the cleaned df with the corrected edge dataframes:
#   - edges_with_multilinks_split
#   - edges_with_multilinks2_split
df_swisslipids_edges = pd.concat([df_clean, edges_with_multilinks_split, edges_with_multilinks2_split], ignore_index=True)

# Drop any duplicate rows that might arise
df_swisslipids_edges = df_swisslipids_edges.drop_duplicates()

# df_swisslipids_edges now contains the original "good" rows plus the corrected edges.
df_swisslipids_edges

	source_layer	source_id	target_layer	target_id
0	swisslipids	SLM:000000002	sl_chebi	70846
1	swisslipids	SLM:000000003	sl_chebi	70771
2	swisslipids	SLM:000000006	sl_chebi	70829
3	swisslipids	SLM:000000007	sl_chebi	70775
4	swisslipids	SLM:000000035	sl_chebi	57817
...	...	...	...	...
6890974	swisslipids	SLM:000748036	sl_components_parsed	SLM:000000853
6890975	swisslipids	SLM:000748037	sl_components_parsed	SLM:000000048
6890976	swisslipids	SLM:000748037	sl_components_parsed	SLM:000001197
6890977	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000047
6890978	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000048

6890979 rows × 4 columns
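A quick sanity check after recombining is to re-apply the same masks and confirm that no multi-value rows are left. The sketch below does this on toy frames; the helper name multivalue_mask and all values are hypothetical.

```python
import pandas as pd

original = pd.DataFrame({
    "target_layer": ["sl_chebi", "sl_chebi", "sl_components"],
    "target_id": ["70846", "74546 | 82922", "SLM:1 (sn1) / SLM:2 (sn2)"],
})
corrected = pd.DataFrame({
    "target_layer": ["sl_chebi", "sl_chebi", "sl_components", "sl_components"],
    "target_id": ["74546", "82922", "SLM:1 (sn1)", "SLM:2 (sn2)"],
})


def multivalue_mask(df):
    """Flag rows whose target_id still holds more than one value."""
    has_pipe = df["target_id"].str.contains("|", regex=False, na=False)
    has_slash = (
        df["target_id"].str.contains("/", regex=False, na=False)
        & df["target_layer"].str.startswith("sl_components", na=False)
    )
    return has_pipe | has_slash


# Keep the rows that were fine, then append the split replacements.
kept = original[~multivalue_mask(original)]
combined = pd.concat([kept, corrected], ignore_index=True).drop_duplicates()
remaining = int(multivalue_mask(combined).sum())
print(remaining, "multi-value rows remain")
```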

Now we will determine whether each edge stays within one layer (intralayer) or connects two different layers (interlayer).

def assess_edge_layertype(df):
    interlayer = df['source_layer']!=df['target_layer']
    df['interlayer'] = interlayer
    return df 

df_swisslipids_edges = assess_edge_layertype(df_swisslipids_edges)
df_swisslipids_edges

	source_layer	source_id	target_layer	target_id	interlayer
0	swisslipids	SLM:000000002	sl_chebi	70846	True
1	swisslipids	SLM:000000003	sl_chebi	70771	True
2	swisslipids	SLM:000000006	sl_chebi	70829	True
3	swisslipids	SLM:000000007	sl_chebi	70775	True
4	swisslipids	SLM:000000035	sl_chebi	57817	True
...	...	...	...	...	...
6890974	swisslipids	SLM:000748036	sl_components_parsed	SLM:000000853	True
6890975	swisslipids	SLM:000748037	sl_components_parsed	SLM:000000048	True
6890976	swisslipids	SLM:000748037	sl_components_parsed	SLM:000001197	True
6890977	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000047	True
6890978	swisslipids	SLM:000748038	sl_components_parsed	SLM:000000048	True

6890979 rows × 5 columns
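The function above adds the column to the frame it is given. Where the input should stay untouched, the same flag can be computed with assign, as in this sketch on toy data (the name with_interlayer is hypothetical).

```python
import pandas as pd

toy_edges = pd.DataFrame({
    "source_layer": ["swisslipids", "swisslipids"],
    "source_id": ["SLM:1", "SLM:1"],
    "target_layer": ["sl_chebi", "swisslipids"],
    "target_id": ["70846", "SLM:2"],
})


def with_interlayer(df):
    """Return a copy of df with a boolean 'interlayer' column; df itself is not modified."""
    return df.assign(interlayer=df["source_layer"].ne(df["target_layer"]))


labelled = with_interlayer(toy_edges)
print(labelled["interlayer"].value_counts())
```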

Now we will build the node df

Building the node df¶

df_swisslipids_nodes = create_nodedf_from_edgedf(edge_df=df_swisslipids_edges, props=['layer', 'id'], cols=['layer', 'node_id'])
df_swisslipids_nodes

	layer	node_id
0	swisslipids	SLM:000000002
1	swisslipids	SLM:000000003
2	swisslipids	SLM:000000006
3	swisslipids	SLM:000000007
4	swisslipids	SLM:000000035
...	...	...
13781953	sl_components_parsed	SLM:000000853
13781954	sl_components_parsed	SLM:000000048
13781955	sl_components_parsed	SLM:000001197
13781956	sl_components_parsed	SLM:000000047
13781957	sl_components_parsed	SLM:000000048

13781958 rows × 2 columns

Let’s also count how many times each (layer, node_id) pair appears, since a node is listed once for every edge it takes part in.

df_swisslipids_nodes.value_counts(dropna=True)

layer        node_id      
swisslipids  SLM:000000353    132660
             SLM:000000377     98800
             SLM:000000102     80218
             SLM:000117148     46826
             SLM:000000400     38525
                               ...  
sl_metanetx  MNXM311776            1
             MNXM311777            1
             MNXM311778            1
             MNXM311779            1
swisslipids  SLM:000782332         1
Name: count, Length: 2783345, dtype: int64
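The node table above has 13,781,958 rows, exactly twice the 6,890,979 edges, which is consistent with stacking every source and every target endpoint. The sketch below does that in plain pandas. It is an assumption about the general idea, not the implementation of create_nodedf_from_edgedf, and the data are toy values.

```python
import pandas as pd

toy_edges = pd.DataFrame({
    "source_layer": ["swisslipids", "swisslipids"],
    "source_id": ["SLM:1", "SLM:1"],
    "target_layer": ["sl_chebi", "swisslipids"],
    "target_id": ["70846", "SLM:2"],
})


def nodes_from_edges(edges):
    """Stack the (layer, id) pairs of both edge endpoints into one node table."""
    sources = edges[["source_layer", "source_id"]].rename(
        columns={"source_layer": "layer", "source_id": "node_id"})
    targets = edges[["target_layer", "target_id"]].rename(
        columns={"target_layer": "layer", "target_id": "node_id"})
    return pd.concat([sources, targets], ignore_index=True)


all_endpoints = nodes_from_edges(toy_edges)                     # one row per edge endpoint
unique_nodes = all_endpoints.drop_duplicates(ignore_index=True)  # one row per node
print(len(all_endpoints), "endpoints,", len(unique_nodes), "unique nodes")
```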

Now let’s merge the nodes with the original SwissLipids table to give the swisslipids nodes richer attributes. Nodes in other layers have no matching row, so their attribute columns are NaN.

df_swisslipids_nodes = pd.merge(df_swisslipids_nodes, df_swisslipids.assign(from_layer_col='swisslipids'),
                                left_on=['layer','node_id'], right_on=['from_layer_col','Lipid ID'],
                                how='outer')
df_swisslipids_nodes

	layer	node_id	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	...	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID	Components_parsed	from_layer_col
0	sl_abbreviation	(5S)-HpHEPE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	sl_abbreviation	15-KETE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	sl_abbreviation	(10,11S,12R)-TriHETrE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	sl_abbreviation	(10R)-H-(11S,12S)-EpETrE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	sl_abbreviation	(10R)-H-(8S,9S)-EpETrE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
13781953	swisslipids	SLM:000782330	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
13781954	swisslipids	SLM:000782331	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
13781955	swisslipids	SLM:000782331	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
13781956	swisslipids	SLM:000782331	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
13781957	swisslipids	SLM:000782332	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

13781958 rows × 33 columns

This contains a lot of duplicates, so let’s remove them. We also drop from_layer_col, which carries no information here: it is just a relic of joining back to the initial df we used to create the edges (this could probably be tidied up).

df_swisslipids_nodes = df_swisslipids_nodes.drop_duplicates()
df_swisslipids_nodes = df_swisslipids_nodes.drop(columns='from_layer_col')
df_swisslipids_nodes

	layer	node_id	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	...	Exact m/z of [M+NH4]+	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID	Components_parsed
0	sl_abbreviation	(5S)-HpHEPE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	sl_abbreviation	15-KETE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	sl_abbreviation	(10,11S,12R)-TriHETrE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	sl_abbreviation	(10R)-H-(11S,12S)-EpETrE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	sl_abbreviation	(10R)-H-(8S,9S)-EpETrE	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
13781947	swisslipids	SLM:000782328	SLM:000782328	NaN	oxidized 2-acylglycerol	NaN	NaN	SLM:000000355	NaN	NaN	...	NaN	NaN	NaN	NaN	167117	NaN	NaN	NaN	NaN	NaN
13781950	swisslipids	SLM:000782329	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
13781953	swisslipids	SLM:000782330	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
13781954	swisslipids	SLM:000782331	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
13781957	swisslipids	SLM:000782332	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

2783345 rows × 32 columns
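One possible tidy-up is to drop the duplicate nodes before the merge, so the join runs on far fewer rows. The sketch below shows that order of operations on toy data. It uses a left join rather than the outer join above, so attribute rows with no matching node are not kept; the helper column attr_layer is hypothetical.

```python
import pandas as pd

nodes = pd.DataFrame({
    "layer": ["swisslipids", "swisslipids", "sl_chebi", "swisslipids"],
    "node_id": ["SLM:1", "SLM:1", "70846", "SLM:3"],
})
attrs = pd.DataFrame({
    "Lipid ID": ["SLM:1", "SLM:2"],
    "Name": ["lipid one", "lipid two"],
})

# Deduplicate first, then attach attributes to the swisslipids nodes only.
unique_nodes = nodes.drop_duplicates(ignore_index=True)
enriched = unique_nodes.merge(
    attrs.assign(attr_layer="swisslipids"),
    left_on=["layer", "node_id"],
    right_on=["attr_layer", "Lipid ID"],
    how="left",
).drop(columns="attr_layer")
print(enriched)
```

Whether this produces the same final table as the notebook depends on the data, so it would need checking against the packaged output before being adopted.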

Now we have the node and edge dfs for SwissLipids and understand how we arrived at them. In practice you don’t have to go through this process every time: LipiNet offers a convenience function that does exactly this, if you want the same network setup.

Using the LipiNet `parse_swisslipids` function¶

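The parse_swisslipids_data function, from the lipinet.parse_swisslipids module, automatically runs through all of the same steps we have just covered. It returns the node and edge dfs under the keys df_nodes and df_edges.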

from lipinet.parse_swisslipids import parse_swisslipids_data

sl_results = parse_swisslipids_data(verbose=False)
df_sl_nodes = sl_results['df_nodes']
df_sl_edges = sl_results['df_edges']

We can also check that the two approaches agree, starting with an individual entry.

df_swisslipids_nodes.iloc[0]

layer                        sl_abbreviation
node_id                          (5S)-HpHEPE
Lipid ID                                 NaN
Level                                    NaN
Name                                     NaN
Abbreviation*                            NaN
Synonyms*                                NaN
Lipid class*                             NaN
Parent                                   NaN
Components*                              NaN
SMILES (pH7.3)                           NaN
InChI (pH7.3)                            NaN
InChI key (pH7.3)                        NaN
Formula (pH7.3)                          NaN
Charge (pH7.3)                           NaN
Mass (pH7.3)                             NaN
Exact Mass (neutral form)                NaN
Exact m/z of [M.]+                       NaN
Exact m/z of [M+H]+                      NaN
Exact m/z of [M+K]+                      NaN
Exact m/z of [M+Na]+                     NaN
Exact m/z of [M+Li]+                     NaN
Exact m/z of [M+NH4]+                    NaN
Exact m/z of [M-H]-                      NaN
Exact m/z of [M+Cl]-                     NaN
Exact m/z of [M+OAc]-                    NaN
CHEBI                                    NaN
LIPID MAPS                               NaN
HMDB                                     NaN
MetaNetX                                 NaN
PMID                                     NaN
Components_parsed                        NaN
Name: 0, dtype: object

df_sl_nodes.iloc[0]

layer                        sl_abbreviation
node_id                          (5S)-HpHEPE
Lipid ID                                 NaN
Level                                    NaN
Name                                     NaN
Abbreviation*                            NaN
Synonyms*                                NaN
Lipid class*                             NaN
Parent                                   NaN
Components*                              NaN
SMILES (pH7.3)                           NaN
InChI (pH7.3)                            NaN
InChI key (pH7.3)                        NaN
Formula (pH7.3)                          NaN
Charge (pH7.3)                           NaN
Mass (pH7.3)                             NaN
Exact Mass (neutral form)                NaN
Exact m/z of [M.]+                       NaN
Exact m/z of [M+H]+                      NaN
Exact m/z of [M+K]+                      NaN
Exact m/z of [M+Na]+                     NaN
Exact m/z of [M+Li]+                     NaN
Exact m/z of [M+NH4]+                    NaN
Exact m/z of [M-H]-                      NaN
Exact m/z of [M+Cl]-                     NaN
Exact m/z of [M+OAc]-                    NaN
CHEBI                                    NaN
LIPID MAPS                               NaN
HMDB                                     NaN
MetaNetX                                 NaN
PMID                                     NaN
Components_parsed                        NaN
Name: 0, dtype: object

For the first entry it looks good, but what about the entire df? We can use the pd.testing.assert_frame_equal function, which raises an AssertionError when two DataFrames differ and returns nothing when they match.

First, as a negative control, we compare df_swisslipids_nodes with df_swisslipids_edges, which should obviously fail.

try:
    pd.testing.assert_frame_equal(df_swisslipids_nodes, df_swisslipids_edges)
    print('DataFrames are equal')
except AssertionError as e:
    print(e)

DataFrame are different

DataFrame shape mismatch
[left]:  (2783345, 32)
[right]: (6890979, 5)

Now let’s compare df_swisslipids_nodes with df_sl_nodes, which should pass without raising an error. We will test the edges df while we’re at it.

try:
    pd.testing.assert_frame_equal(df_swisslipids_nodes, df_sl_nodes)
    print('DataFrames for nodes are equal')
except AssertionError as e:
    print(e)

DataFrames for nodes are equal

try:
    pd.testing.assert_frame_equal(df_swisslipids_edges, df_sl_edges)
    print('DataFrames for edges are equal')
except AssertionError as e:
    print(e)

DataFrames for edges are equal

Great! Both approaches produce the same dfs. We will use these dfs in other parts of the package.

If they had differed, we could inspect the exact rows that do not match like this:

diff = df_sl_edges.merge(df_swisslipids_edges, how='outer', indicator=True)
diff_rows_edges = diff[diff['_merge'] != 'both']
diff_rows_edges

	source_layer	source_id	target_layer	target_id	interlayer	_merge

diff = df_sl_nodes.merge(df_swisslipids_nodes, how='outer', indicator=True)
diff_rows_nodes = diff[diff['_merge'] != 'both']
diff_rows_nodes

	layer	node_id	Lipid ID	Level	Name	Abbreviation*	Synonyms*	Lipid class*	Parent	Components*	...	Exact m/z of [M-H]-	Exact m/z of [M+Cl]-	Exact m/z of [M+OAc]-	CHEBI	LIPID MAPS	HMDB	MetaNetX	PMID	Components_parsed	_merge

0 rows × 33 columns
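Since both outputs above are empty, here is a small sketch of what the same two checks report when the frames really do differ. The frames, their values and the helper frames_match are made up for illustration.

```python
import pandas as pd

manual = pd.DataFrame({"source_id": ["SLM:1", "SLM:2"], "target_id": ["70846", "70771"]})
packaged = pd.DataFrame({"source_id": ["SLM:1", "SLM:2"], "target_id": ["70846", "99999"]})


def frames_match(left, right):
    """Return True if assert_frame_equal passes, False if it raises."""
    try:
        pd.testing.assert_frame_equal(left, right)
        return True
    except AssertionError:
        return False


# The outer merge keeps every row from both sides; '_merge' records where each row was found.
diff = manual.merge(packaged, how="outer", indicator=True)
mismatches = diff[diff["_merge"] != "both"]
print(frames_match(manual, packaged))
print(mismatches)
```

A row marked left_only exists only in the first frame and a row marked right_only only in the second, which points straight at the disagreeing edges.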

The edges for an individual lipid should also be the same in both dfs.

df_sl_edges[df_sl_edges['source_id']=='SLM:000389145']

	source_layer	source_id	target_layer	target_id	interlayer
1640	swisslipids	SLM:000389145	sl_chebi	18059	True
429400	swisslipids	SLM:000389145	sl_metanetx	MNXM12117	True
549344	swisslipids	SLM:000389145	swisslipids	SLM:000000436	False
549407	swisslipids	SLM:000389145	swisslipids	SLM:000000525	False
549887	swisslipids	SLM:000389145	swisslipids	SLM:000001193	False
665828	swisslipids	SLM:000389145	swisslipids	SLM:000117142	False
936914	swisslipids	SLM:000389145	swisslipids	SLM:000390054	False
1046948	swisslipids	SLM:000389145	swisslipids	SLM:000500463	False
1055230	swisslipids	SLM:000389145	swisslipids	SLM:000508860	False
1328368	swisslipids	SLM:000389145	swisslipids	SLM:000782283	False

df_swisslipids_edges[df_swisslipids_edges['source_id']=='SLM:000389145']

	source_layer	source_id	target_layer	target_id	interlayer
1640	swisslipids	SLM:000389145	sl_chebi	18059	True
429400	swisslipids	SLM:000389145	sl_metanetx	MNXM12117	True
549344	swisslipids	SLM:000389145	swisslipids	SLM:000000436	False
549407	swisslipids	SLM:000389145	swisslipids	SLM:000000525	False
549887	swisslipids	SLM:000389145	swisslipids	SLM:000001193	False
665828	swisslipids	SLM:000389145	swisslipids	SLM:000117142	False
936914	swisslipids	SLM:000389145	swisslipids	SLM:000390054	False
1046948	swisslipids	SLM:000389145	swisslipids	SLM:000500463	False
1055230	swisslipids	SLM:000389145	swisslipids	SLM:000508860	False
1328368	swisslipids	SLM:000389145	swisslipids	SLM:000782283	False
