Cytotoxicity of nanoparticles in normal and cancer cell lines¶
Original Data¶
Title: Cytotoxicity of nanoparticles in normal and cancer cell lines
Description: The dataset contains information on the toxicity of inorganic nanoparticles on various normal and cancer cell lines.
Total number of records: 5476
Number of features (columns): 34
Data type: Mixed
Application: Nanomaterials
Automatic validation: Yes
Data Scheme¶
Cytotoxicity Dataset – Column Descriptions¶
Column Name | Description |
---|---|
sn | Index of the sample (serial number). |
material | Composition of the nanoparticle/material tested (e.g., SiO₂, Ag). |
shape | Physical shape of the particle (e.g., Sphere, Rod). |
coat_functional_group | Surface coating or functionalization (e.g., CTAB, PEG). |
synthesis_method | Synthesis method (e.g., Precipitation, Commercial). |
surface_charge | Reported surface charge (e.g., Negative, Positive). |
size_in_medium_nm | Hydrodynamic size in biological medium (nm). |
zeta_in_medium_mv | Zeta potential in medium (mV; blank if not measured). |
potential_mv | Surface charge in solution (mV). |
no_of_cells_cells_well | Cell density per well in the assay. |
cell_type | Specific cell line name (e.g., PC12, A549). |
human_animal | Origin of cells (A = Animal, H = Human). |
cell_source | Species/organism (e.g., Rat, Human). |
cell_tissue | Tissue origin of the cell line (e.g., Adrenal Gland, Lung). |
cell_morphology | Cell shape (e.g., Irregular, Epithelial). |
cell_age | Developmental stage of cells (e.g., Adult, Embryonic). |
time_hr | Exposure duration in hours. |
concentration | Tested concentration of the material (e.g., µg/mL). |
test | Cytotoxicity assay type (e.g., MTT, LDH). |
test_indicator | Reagent measured (e.g., TetrazoliumSalt for MTT). |
viability_% | Cell viability percentage relative to control. |
doi | Digital Object Identifier (DOI) of the article. |
article_list | Identifier for the article in the dataset. |
core_nm | Primary particle size (nm). |
hydrodynamic_nm | Size in solution including coatings (nm). |
journal_name | Name of the publishing journal. |
publisher | Publisher of the article. |
year | Year of publication. |
title | Title of the article from which data was extracted. |
journal_is_oa | Whether the journal is Open Access (TRUE/FALSE). |
is_oa | Whether the specific article is Open Access (TRUE/FALSE). |
oa_status | Open Access status (e.g., hybrid, gold, closed). |
Filename of the corresponding PDF in the PDF archive. | |
access | Access status (1 = open, 0 = closed). |
Metadata¶
All metadata is included in the main table and also used for source validation.
Column Name | Description |
---|---|
journal_name | Name of the publishing journal. |
publisher | Publisher of the article. |
year | Year of publication. |
title | Title of the article. |
journal_is_oa | Whether the journal is Open Access (TRUE/FALSE). |
is_oa | Whether the specific article is Open Access (TRUE/FALSE). |
oa_status | Open Access status at article level (e.g., hybrid, gold, closed). |
File name of the article PDF in archive. | |
access | Access code (1 = open access, 0 = closed access). |
Key Notes
human/animal
: "A" for Animal, "H" for Humanviability (%)
: Values >100% may indicate proliferation stimulation- Blanks: Missing values (e.g., zeta in medium) imply unreported data
Dataset Description¶
The dataset contains experimental data on the effects of inorganic nanoparticles on human and animal cells. The main focus is on the assessment of cytotoxicity using standard tests (e.g., MTT) to understand how different nanoparticle characteristics (shape, size, charge, coating) affect cell survival.
Each entry includes:
- Nanoparticle characteristics (material, shape, size, charge, etc.)
- Cell line information (source, tissue, morphology, etc.)
- Experimental conditions (exposure time, concentration, etc.)
- Results of cell survival tests
- Publication metadata (DOI, journal, year, etc.)
The aim of the dataset is to contribute to the investigation of the patterns between nanoparticle parameters and their cytotoxic effects.
Validation Results¶
In the Cytotox dataset, a total of 1,351 corrections were identified, nearly all of which (1,350) were pattern-based.
These were mostly found in the cell_type
, test
, and test_indicator
columns.
The predominant problems involved terminology inconsistencies, abbreviation mismatches, and stylistic variation across sources.
The high pattern consistency enabled efficient rule-based correction.