data science
core competencies: data analysis, machine learning, deep learning, natural language processing, big data
programming: Python, R, SQL
python libraries: Scikit-learn, TensorFlow, PyTorch, XGBoost, Hugging Face Transformers, Matplotlib, Seaborn
databases & cloud: PostgreSQL, MS SQL Server, Oracle, Neo4j
projects
party differences in minority-related discourse in german election manifestos: a text analysis approach
This project collects data on parliamentary speeches in Germany that deal with equality, minorities, discrimination and societal norms. It then analyses the speeches using TF-IDF, term co-occurrence and topic modelling. The research question is: How do German parliamentary parties differ in the themes and keywords they use when speaking about minorities?
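To give a feel for the TF-IDF step, here is a minimal sketch using only the Python standard library. It is not the project's actual pipeline: the three short English "speeches", the party labels and the helper names `tf_idf` and `top_terms` are all invented for illustration, and the weighting (relative term frequency times `log(N / df)`) is one common variant among several.

```python
import math
from collections import Counter

# Hypothetical toy corpus: one short "speech" per party (not project data).
speeches = {
    "party_a": "equality for minorities requires protection against discrimination",
    "party_b": "societal norms shape how minorities experience integration",
    "party_c": "discrimination law must guarantee equality for minorities",
}


def tf_idf(docs):
    """Return {doc_id: {term: weight}} with weight = relative tf * log(N / df)."""
    tokens = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for words in tokens.values() for term in set(words))
    weights = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        weights[doc_id] = {
            term: (count / len(words)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


def top_terms(weights, k=3):
    """Return the k highest-weighted terms per document (ties broken alphabetically)."""
    return {
        doc_id: sorted(w, key=lambda t: (-w[t], t))[:k]
        for doc_id, w in weights.items()
    }


weights = tf_idf(speeches)
for party, terms in top_terms(weights).items():
    print(party, terms)
```

A term used by every party, such as "minorities" in this toy corpus, gets a weight of zero, which is why TF-IDF is useful for surfacing the vocabulary that distinguishes one party from the others.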
view on github
exploring gender's impact on co-authorship networks in cancer research
Using a subsample of the PubMed Knowledge Graph comprising over 31 million articles authored by 18 million researchers from 1781 to 2022, this project analyzes gender disparities in two biomedical fields. By combining network-analytical tools with machine-learning-based classification of researchers’ gender, it investigates whether research topic and the gender ratio within a field are related, based on an analysis of co-authorship networks.
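As a rough illustration of the kind of measure a co-authorship analysis can produce, the sketch below builds co-author ties from paper records and summarises each field. It is a simplified assumption, not the project's method: the records, author IDs, field names and gender labels are invented, the labels are treated as given (the project infers them with a classifier), gender is reduced to two categories for brevity, and `coauthor_edges` and `field_summary` are hypothetical helper names.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records; author IDs, fields and gender labels are invented.
papers = [
    {"field": "field_x", "authors": ["a1", "a2", "a3"]},
    {"field": "field_x", "authors": ["a2", "a4"]},
    {"field": "field_y", "authors": ["a5", "a6"]},
]
gender = {"a1": "F", "a2": "M", "a3": "F", "a4": "M", "a5": "F", "a6": "F"}


def coauthor_edges(records):
    """Count how often each unordered pair of authors shares a paper."""
    edges = Counter()
    for record in records:
        for pair in combinations(sorted(set(record["authors"])), 2):
            edges[pair] += 1
    return edges


def field_summary(records, labels):
    """Per field: share of women among authors and share of mixed-gender ties."""
    summary = {}
    for field in sorted({r["field"] for r in records}):
        subset = [r for r in records if r["field"] == field]
        authors = {a for r in subset for a in r["authors"]}
        edges = coauthor_edges(subset)
        mixed = sum(1 for u, v in edges if labels[u] != labels[v])
        summary[field] = {
            "share_women": sum(labels[a] == "F" for a in authors) / len(authors),
            "share_mixed_ties": mixed / len(edges) if edges else 0.0,
        }
    return summary


summary = field_summary(papers, gender)
for field, stats in summary.items():
    print(field, stats)
```

Comparing such per-field figures across research topics is one simple way to ask whether topic and gender ratio move together; a full analysis would work on the complete network rather than on isolated summary statistics.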
view on github