Natalia Garcia Bioinformatician and Data Scientist

My Expertise

Computational Biologist with a background in Biotechnology, enthusiastic about coding and about learning ML and semantic technology applications in health and biology.

Data Science Stack

MLOps · KG · RAG · NLP · GenAI. Stack: TensorFlow · PyTorch · Spark NLP · Healthcare NLP · LangChain. ETL development. Databases: MySQL · MongoDB · RDF · SPARQL

Code

Languages: Python · R · Scala · Shell · Perl · Apache Spark SQL · GraphX. Frameworks and containerization: Docker · Flask. Orchestration: Apache Airflow · Oozie · Nextflow

Tools

Bioinformatics tools and pipelines: Epigenomics · Transcriptomics · Metagenomics · Genomics. Interactive Data Visualization: Shiny · Plotly · Mathematical Modelling. Ontologies and Medical Terminologies: SNOMED-CT · UMLS · OMOP

Featured Projects

aipneu

Pneumonia-identification-from-X-Ray-images

  • Featured Skills

Spark-based machine learning models that classify images from a pediatric chest X-ray dataset in order to predict pneumonia in patients. The models are implemented with the high-level PySpark interfaces, using BigDL as the platform for developing scalable deep learning applications.

Check it out

ocr

OCR and web scraping of old Spanish newspapers

  • Featured Skills

Project that scrapes HTML pages containing ancestry information.

Check it out