Data Scientist - Berlin, Germany - Themis
Description
Want to be at the forefront of creating the next generation of collaborative video creators? We are growing our tech team and are looking for a (senior) machine learning engineer with a focus on NLP and expertise in text extraction from PDFs, machine translation, and/or text classification.
The text extraction solutions you develop will need to work for the following formats: HTML, PPTX, and PDF. They must also convert the extracted text into a standard format so that the subsequent document processing phases can be standardized.
Tasks
- Building scalable solutions and RESTful APIs to expose models as a service
- Working collaboratively with the data and engineering teams to build predictive models that drive business value
- Staying up to date with new innovations in the field of machine learning
- Proactively owning and driving projects forward
- Embracing the ML project structure, keeping GitHub updated and regularly documenting all relevant models and code
Requirements
- Degree in Computer Science, NLP, Cognitive Science, Human-Computer Interaction, Language Technology, Computational Linguistics, or a closely related field
- Proven experience in developing, training, and fine-tuning machine learning text processing models on large datasets
- Experience working on text and document structure extraction from common formats, information retrieval, text classification, and/or machine translation
- Experience adopting DevOps and CI/CD methodologies to collaborate on a growing platform
- 3+ years of coding experience, with 2+ years working in MLOps deploying models to production
- Comfortable coding independently in Python
Due to public funding requirements, we can only consider applicants who hold a work permit for Germany and are based in Berlin.
Themis embraces diversity and equal opportunity in a serious way. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. The more inclusive we are, the better our work will be.