Wei FANG (+1) 4122779305 |  weifang.cs@gmail.com
EDUCATION
Carnegie Mellon University Pittsburgh, PA
M.S. IN ELECTRICAL AND COMPUTER ENGINEERING Aug. 2015 - May 2017
• Courses: Search Engine, Cloud Computing, Machine Learning, Deep Learning, How To Write Fast Code, Machine
Translation, Speech Recognition
Sun Yat-sen University Guangzhou, China
B.ENG. IN SOFTWARE ENGINEERING Sept. 2011 - June 2015
• Honors: National Scholarship, Google Scholarship, Outstanding Winner in the Software Innovation Contest in SYSU
SKILLS & INTERESTS
Skills C/C++, Python, Java, C#, CUDA, Hadoop, MySQL, HBase, Web Development
Interests Machine Learning, Natural Language Processing, Cloud Computing
INTERN EXPERIENCE
Microsoft Research Asia Beijing, China
INTERN IN MACHINE LEARNING GROUP July 2014 - Sept. 2015
• Design models to train vector representations for words and named entities jointly with Wikipedia and Freebase
• Apply trained embeddings to tackle the problem of entity disambiguation and knowledge completion
• Received the Award of Excellence in the Stars of Tomorrow Internship Program
PROJECTS
Entity Disambiguation by Knowledge and Text Jointly Embedding (C++, C#) Aug. 2015
• Use a cloud platform extensively to pre-process data from Wikipedia, Freebase, and other sources
• Design models and train low-dimensional continuous vector representations for words and entities by jointly
embedding a knowledge base (Freebase) and text (Wikipedia) into the same vector space
• Design features based on trained embeddings for entity disambiguation
• Build a frontend-backend system to demonstrate the work with MVC design
Search Engine Implementation (Java) Sept. 2016
• Implement a search engine with an exact-match retrieval model (Ranked Boolean) and best-match retrieval models (BM25, Indri)
• Integrate pseudo-relevance feedback for query expansion
• Apply learning-to-rank techniques for feature-based search
Speech Recognition System Implementation (C++) Dec. 2015
• Implement a GMM-HMM based system from scratch to recognize continuous spoken digits (such as phone numbers)
• Conduct experiments on a portion of the “Aurora2” corpus with 8400 training recordings and 1000 testing recordings,
achieving an accuracy above 92%
Machine Reading with Recurrent Neural Network and Attention Mechanism (Lua, Torch) Apr. 2016
• Design and train a neural network in Torch to read passages and answer questions
• Apply a bidirectional LSTM to read passages and questions
• Use an attention mechanism to locate supporting facts in a passage based on a question
• Combine a GRU with the attention mechanism to retain order and position information between supporting facts
Accelerate Speaker Verification System with GPU (C++, CUDA) June 2016
• Analyse the most compute-intensive component (i-vector extraction) in a speaker verification system
• Port the i-vector extraction component from CPU to GPU
• Take advantage of multiple GPUs to further accelerate the system
Twitter Analytics Web Service on the Cloud (Python, Java, AWS) Oct. 2016
• Design and implement a frontend-backend system for tweet analytics on AWS
• Process 1 TB of tweet data with MapReduce ETL on AWS EMR and store the results in MySQL and HBase
• Design storage schema to improve the performance of MySQL and HBase
Cross-Platform File Sharing and Presentation System (Python, Java, JavaScript, Android) Dec. 2013
• Propose and develop an application for users to share and present their files anywhere
• Connect the lecturer and the audience through a central server and enable the lecturer to mark and turn pages
while presenting files, with each action broadcast to audience browsers in real time over the WebSocket protocol
• Develop an Android client for lecturers to upload files, admit audience members, and present their work
PUBLICATION
Entity Disambiguation by Knowledge and Text Jointly Embedding CoNLL 2016
WEI FANG, JIANWEN ZHANG, DILIN WANG, ZHENG CHEN, AND MING LI
