Job Information

Simpson Thacher & Bartlett LLP Data Scientist in New York, New York

Description/Job Summary

Simpson Thacher's Data Scientist will play a key role in delivering insights to the Firm's leadership, legal practices, Knowledge Department and other administrative functions. To do so, this expert will use a mix of statistics, machine learning, deep learning and LLMs to derive meaning, build models and create systems that capitalize on both structured and unstructured data. This position will play a crucial role in pushing the Firm's artificial intelligence (AI) efforts forward, working on some of the most advanced projects in the legal industry.

The Data Scientist position sits within the Firm's Data Analytics Group, which is a part of the Firm's Knowledge Department. This position will work closely with lawyers and legal support professionals across practices, as well as other technical resources within the Firm. This role requires creative problem solving, analytical rigor, technical skill and an appreciation for the business and practice of law.

Responsibilities/Duties

  • Support legal teams and relevant operational staff in delivering on opportunities to use data to drive decision-making and improve the efficiency and effectiveness of the Firm's client representations

  • Collaborate with Firm functional departments (e.g., Finance, Talent, Business Development, IT) to analyze data and develop solutions to support operational objectives

  • Develop regression and classification models using established and emerging data science methodologies

  • Chain, fine-tune and deploy pre-trained language models (e.g., BERT, Llama3) to optimize performance on a range of NLP tasks, including text classification, named entity recognition, and generative tasks such as summarization, clause and document generation, and question-answer exchanges

  • Design and deploy document segmentation and embedding approaches to facilitate information retrieval and retrieval augmented generation (RAG)

  • Conduct advanced quantitative research, using machine learning (ML) and natural language processing (NLP) techniques to understand patterns in large volumes of data, identify relationships, detect data anomalies and classify data

  • Design and deploy highly visual reports and interactive user interfaces that surface quantitative insights in forms that are fit-for-purpose, modern and easily accessible

  • Stay current with the latest advancements in LLMs, NLP, Deep Learning and ML research, implementing cutting edge techniques and incorporating them into production models as appropriate

  • Document development processes, codebase, and best practices to facilitate knowledge sharing and maintain a well-organized, reproducible environment

  • Partner with other technical resources to refine data pipelines for recurring classes of analysis and data-driven solutions

  • Handle projects on request under the direction of the CKIO, Director of Data Analytics and other executive staff

Required Skills

  • Highly proficient with statistical programming (e.g., Python, R) and databases (e.g., SQL, Pinecone)

  • Proven experience developing and validating linear and non-linear regression and classification models

  • Expertise in data transformation, data science and visualization libraries (e.g., pandas, scikit-learn, matplotlib, Snorkel, Seaborn)

  • Experience with natural language processing and related libraries (e.g., Hugging Face's Transformers, spaCy, NLTK, and CoreNLP) preferred

  • Ability to design and develop object-oriented machine learning systems beyond Jupyter notebooks a plus

  • Solid understanding of deep learning frameworks such as TensorFlow or PyTorch a plus

  • Proficiency with version control systems such as Git or equivalent tools for code management and collaboration

  • Able to translate business problems to technical logic and practical solutions

  • Able to communicate complex results clearly to a non-technical audience

  • Proactively develops and maintains technical knowledge in emerging data science areas

  • Experience in the legal field is a significant plus

Required Experience

  • 2+ years in a data science, machine learning engineering, artificial intelligence or equivalent role

Required Education

  • Bachelor's degree required, preferably in data science, mathematics, statistics, computer science, engineering, finance or a related field

Preferred Education

  • Master's degree in data science, computer science, statistics, computational linguistics or engineering preferred

  • Prior coursework in deep learning, natural language processing, or information retrieval a significant plus

Details

Salary Information

NY only: The estimated base salary range for this position is $145k to $165k at the time of posting.

The actual salary offered will depend on a variety of factors, including, without limitation, the qualifications of the individual applicant for the position, years of relevant experience, level of education attained, certifications or other professional licenses held, and, if applicable, the location in which the applicant lives and/or from which they will be performing the job. This role is exempt, meaning it is not eligible for overtime pay.

Privacy Notice

For information about how Simpson Thacher & Bartlett LLP collects and processes your personal information, please refer to our Privacy Notice available at https://www.stblaw.com/other/privacy-notice.

#LI-Hybrid
