Hey, I'm
BINGYING LIU :)
Data Science/Visualizer/Music Fanatic

Wells Fargo

NLP Data Scientist Associate, July 2020 to present

Delivered data science and Natural Language Processing (NLP) solutions in partnership with enterprise-wide lines of business, addressing target objectives and success outcomes through agile ceremonies.

Developed a set of productionized machine learning models to identify sufficiently documented advisor notes through reproducible experimentation with various embedding representations and algorithm combinations. Leveraged academic research to justify modeling decisions for model validation and designed monitoring KPIs to reveal data drift and potentially unreliable modeling results.

Developed robust data pipelines for the modeling and production populations, taking the various pipeline impacts into account.

Designed an NLP Annotator Training Procedure that standardizes the workflow for gold label collection, annotation tool setup and official annotator training; it is used in the initial stage of label collection.

Composed model development documentation to communicate algorithm methodology and assumptions, data pipeline impacts and model validation.

Toucan AI

Machine Learning Intern, May to Aug 2019

Worked on grammar correction to identify and fix spelling errors and broken sentences in consumers’ input using rule-based and neural algorithms.

Designed analytics metrics and developed a React dashboard for e-commerce to identify popular products and track potential customers’ sentiment.

Experimented with state-of-the-art academic research in NLP to perform non-span-based question answering, improving the chatbot's QA ability.

Image Characterization for Purple Wave Auction

Project, Sep 2019 - Mar 2020

Purple Wave, a leading no-reserve auction platform for construction and farm equipment, wanted to know how equipment images influence the final winning bid of an auction item.

Extracted interpretable features from auction images to improve upon the baseline model, using an end-to-end price prediction model with saliency maps to interpret the neural net, transfer learning, and a large-scale annotation-based approach (MTurk), among others.

Supported business strategy decisions by achieving a mean absolute error of $2,600 on price prediction tasks (typical price: $20,000).

Built a Dockerized ML interpretation app for our model.

Real-time Dashboard for NASCAR

Project, May 2019

Developed metrics to estimate oil consumption based on track conditions for individual drivers, using a Bayesian updating model and feature engineering on historical data.

Designed a set of Tableau dashboards to help pit crew members decide the best time for a pit stop and to compare different drivers’ racing strategies.

Diabetes Mobile App

Project, Feb 2019

Developed strategies and a corresponding mobile-app wireframe to help type 2 diabetes patients make use of self-generated diabetes-related data and self-manage their condition in real time.

Strategies:
1) Create dynamic visualizations of HbA1c estimates, blood glucose levels, weight change and step count for patients.
2) Generate a weekly report to encourage patients on their current progress.
3) Provide other functions such as viewing past and upcoming doctor appointments, medication info, and alarms when any measurement exceeds a set threshold.

Emotion Perception in Music

Project, Dec 2018

This project explores how factors inherent in music (mode, tempo, timbre, etc.) contribute to musical expressions of emotion such as happy, sad, scary and peaceful, although the emotion ratings collected may be subjective. Correlation, exploratory and residual plots were used to select the best model. In the end, a multiple linear regression model with log encoding and quadratic terms was chosen as the best fit and was able to predict emotional scores given a single set of primary cues.

Machine Learning projects

Projects, Sep 2018 - Mar 2020

1) Music Genre Classification with Parallel CRNN
Abstract: Music genre classification consists of two main components: audio features and a classifier. This paper implemented a classical long short-term memory (LSTM) model for music genre classification on the GTZAN dataset and proposed a parallel CNN and RNN model to improve performance by using various kinds of audio features. After trials on segmentation of the audio files, the proposed parallel CNN and RNN model reached an accuracy of 72.5%.

2) Music Timbre Style Transfer with the NSynth dataset
Abstract: Image style transfer using convolutional neural networks is a popular example to demonstrate the capabilities of machine learning algorithms, with the goal of taking the style of one image and applying it to the content of another. Our focus in this project is to extend this approach to audio generation. In the domain of music, several types of style transfer have been proposed, including composition style, performance style, and timbre style transfer. Here, we will focus on timbre style transfer by extracting the spectrogram from two inputs and performing image style transfer methods using convolutional neural networks.

About me

Hi! I'm Bingying Liu. I design, build and deploy NLP solutions and machine learning systems that benefit downstream users at scale.

Currently I'm an NLP data scientist at Wells Fargo focusing on delivering NLP solutions for risk and compliance use cases. My work consists of collecting data, building interpretable models, and deploying and monitoring models for potential retraining. I also drive and contribute to shared resources and best practices within the center of excellence.

Outside of work, I'm a huge foodie (Atlanta spoils me) and love music, sketches, meditation and hiking. I'm also into design and building things from scratch!

Contact me