Citian

Sr. Data Scientist

Taiwan Full-time

Description

Citian is expanding, and we are seeking a Senior Data Scientist to join our team. We're looking for someone who is passionate, driven, and curious, with extensive experience and expertise in data science.
Who We Are:
Citian is a fast-growing SaaS technology company based in Washington, DC. Our software solutions revolutionize how our transportation systems – roads, rail, transit, bicycle, pedestrian – operate.
Our tech solutions:
  • Reduce traffic fatalities
  • Enhance pedestrian accessibility
  • Empower system operators to save time & money
You’ll work with some of the brightest minds in the software industry. Our software engineers apply the latest in emerging tech, Artificial Intelligence and Machine Learning to build smarter, more advanced tools for our diverse client base. We work with clients across the United States, with global ambitions in the years ahead.
Who You Are:
As a Senior Data Scientist, you will play a pivotal role in expanding our databases, cleaning large datasets, and generating data-driven insights. You should have a minimum of 7 years of experience working with large datasets in a Python/SQL environment, along with a proven track record of delivering impactful data science solutions. You'll collaborate closely with other developers to enhance our software lineup and deliver contracted solutions for engineers.

Responsibilities

  • Model Development: Develop and optimize data science models to process, analyze, and extract information from diverse data sources, with a focus on textual data using techniques such as natural language processing (NLP).
  • Machine Learning and AI: Apply machine learning and artificial intelligence techniques to build predictive and prescriptive models.
  • Data Preprocessing: Perform data cleansing, normalization, and transformation on large datasets using Python libraries (e.g., Pandas, NumPy) to prepare them for analysis and model training.
  • Feature Engineering: Identify and engineer relevant features to enhance model performance and accuracy.
  • Model Evaluation: Design and implement robust evaluation metrics (e.g., precision, recall, F1-score) and frameworks (e.g., cross-validation) to assess the performance of machine learning models.
  • Collaboration: Work closely with cross-functional teams, including engineers, product managers, and domain experts, to understand business requirements and deliver data science solutions that meet those needs.
  • Research: Stay updated on the latest advancements in NLP and AI research by reading scientific literature and attending conferences, and apply these advancements to real-world problems in transportation systems.

Expected Duties:

  • Lead the development and implementation of data science initiatives by performing data cleansing, conducting exploratory data analysis, and building predictive models using Python and SQL.
  • Provide technical leadership and guidance to junior data scientists
  • Mentor team members, providing support and fostering professional growth
  • Collaborate with cross-functional teams through version control systems like GitLab
  • Effectively communicate project deadlines and requirements to team members
  • Conduct research to develop innovative solutions for transportation engineering challenges, utilizing the latest advancements in machine learning and AI.
  • Participate in brainstorming sessions to innovate and explore new approaches to problem-solving
  • Implement advanced data science tools and techniques to address complex data challenges, such as optimizing model performance and scaling data processing workflows.
  • Consistently deliver projects on time while upholding high standards of quality.

Hard Requirements

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Statistics, or a related field
  • Strong background in Mathematics and Statistics
  • Minimum of 7 years of professional programming experience with a focus on data science
  • 4-6 years of experience in natural language processing, machine learning, artificial intelligence, and data science leadership
  • Proficiency in Python, SQL, Python data science libraries (Pandas, Scikit-learn, etc.), and Python NLP libraries (NLTK, SpaCy, etc.)
  • Strong knowledge of machine learning techniques and AI algorithms.
  • Solid understanding of data preprocessing, feature engineering, and model evaluation
  • Preferred: experience with machine learning and LLM tooling (HuggingFace, OpenAI, PyTorch, TensorFlow, etc.) and/or with model deployment and evaluation, Snowflake, or AWS
  • Demonstrated expertise in developing and deploying data science solutions in Python/SQL environments
  • Extensive experience with professional-grade cloud databases and data warehouses such as Snowflake
  • Ability to effectively communicate complex algorithms and their implications
  • Proficiency in SQL for data retrieval and storage
  • Excellent problem-solving skills and the ability to evaluate data structures for optimal performance
  • Strong programming organization and documentation skills
  • Proven ability to collaborate effectively with cross-functional teams
  • Strong work ethic, commitment to excellence, and a passion for innovation

Soft Requirements

Why Citian?

  • Competitive salary and benefits package with 401k match.
  • Opportunity to work on cutting-edge data science projects in the transportation sector.
  • Collaborative and innovative work environment.
  • Continuous learning and professional development opportunities.
  • A chance to make a significant impact in a rapidly evolving industry.

Your Citian Advantage:

  • Discretionary annual merit-based bonus and annual raise
  • Strong medical, vision, and dental insurance
  • Commuter subsidy benefit
  • Tuition reimbursement assistance
  • On-site gym and free snacks in office
  • and more!
Terminal 1 Badge & Skills
Data Scientist
Level 3
Skills Required for this Badge:
  • Machine Learning
  • Web Analytics
  • Statistics and Probability
  • Modern Scripting and Command Line
  • Business Metrics and Accounting
  • Dashboards and Visualization