Xiaoyan Lin

Welcome to my home page.

Name
Xiaoyan Lin
Email
linxiaoyan18(AT)gmail(DOT)com

I was a Software Engineer at Microsoft (China) Co. Ltd. from May 2015 to Oct 2015, working on keyword selection in the Ads Understanding team in Bing Ads. Before that, I was a Software Engineer at the Yahoo! (Beijing) R&D center from July 2014 to March 2015, working in the Advertising & Data (AD&D) team.

Before joining Yahoo!, I received my PhD degree from Peking University in July 2014. My PhD thesis was on mathematical formula recognition and retrieval from PDF documents. My advisors at Peking University were Prof. Zhi Tang and Dr. Liangcai Gao.

I'm also an active contributor to Netty, an open source project. Find my commits here.

2015.5--2015.10 Software Engineer, Ads Understanding, Bing Ads, Microsoft (China) Co. Ltd., Beijing, China.
2014.7--2015.3 Software Development Engineer, Advertising&Data, Yahoo! (Beijing) R&D center, Beijing, China.
2009.9--2014.7 Ph.D, Peking University, Institute of Computer Science & Technology, Beijing, China.

Thesis: Research on Mathematical Formula Recognition and Retrieval from PDF Documents.

2012.9--2013.3 Visiting Ph.D. student, University of Birmingham, School of Computer Science, Birmingham, UK.

Integrated mathematical formula identification into Maxtract and improved text line detection.

2005.9--2009.7 Bachelor of Computer & Science, Beijing Normal University, Institute of Information Science & Technology, Beijing, China

[My Google Scholar citations]

Journal papers

  1. Xiaoyan Lin, Liangcai Gao, Zhi Tang, Josef B. Baker and Volker Sorge, "Mathematical formula identification and performance evaluation in PDF documents", Int'l. Journal on Document Analysis and Recognition (IJDAR), Volume 17, Issue 3, pp 239-255, September 2014.

  2. Xiaoyan Lin, Liangcai Gao, Zhi Tang, "Research on mathematical formula identification in digital Chinese documents", Acta Scientiarum Naturalium Universitatis Pekinensis (ASNUP), Vol.50, No.1, pp. 17-24, 2014.

Conference papers

  1. Xiaoyan Lin, Liangcai Gao, Xuan Hu, Zhi Tang, Yingnan Xiao, Xiaozhong Liu, "A Mathematics Retrieval System for Formulae in Layout Presentations," The 37th Annual ACM SIGIR (Special Interest Group On Information Retrieval) Conference, pp. 697-706, 2014. [System]

  2. Xiaoyan Lin, Liangcai Gao, Zhi Tang, Josef Baker, Mohamed Alkalai and Volker Sorge, "A Text Line Detection Method for Mathematical Formula Recognition," Proc. Int. Conf. Document Analysis and Recognition (ICDAR), pp. 339-343, 2013.

  3. Mohamed Alkalai, Josef B. Baker, Volker Sorge and Xiaoyan Lin, "Improving Formula Analysis with Line and Mathematics Identification," Proc. Int. Conf. Document Analysis and Recognition (ICDAR), pp. 334-338, 2013.

  4. Xuan Hu, Liangcai Gao, Xiaoyan Lin, Zhi Tang, Xiaofan Lin and Josef Baker, "WikiMirs: A Mathematical Information Retrieval System for Wikipedia", 13th ACM/IEEE-Computer Science Joint Conference on Digital Libraries (JCDL), pp.11-20, 2013.

  5. Liangcai Gao, Zhi Tang, Xiaoyan Lin and Yongtao Wang, "A Graph-Based Method of Newspaper Article Reconstruction," Proc. of the 21st International Conference on Pattern Recognition (ICPR), pp.1566-1569, 2012.

  6. Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin and Xuan Hu, "Performance Evaluation of Mathematical Formula Identification," Proc. of the 10th IAPR International Workshop on Document Analysis Systems (DAS), pp. 287-291, 2012. [Dataset]

  7. Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xuan Hu and Xiaofan Lin, "Identification of embedded mathematical formulas in PDF documents using SVM," Proc. of SPIE-IS&T. Document Recognition and Retrieval (DRR) XIX, pp. 8297 0D 1-8, 2012.

  8. Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin and Xuan Hu, "Mathematical formula identification in PDF documents," Proc. Int. Conf. Document Analysis and Recognition (ICDAR), pp. 1419-1423, 2011.

Other groups

I'm aware of:
  • SDAG: Scientific Document Analysis Group from the University of Birmingham. In particular, I am interested in the mathematical formula recognition work by Josef Baker. Maxtract is a tool developed by SDAG for converting PDF into formats such as LaTeX, MathML and text.
  • DPRL: The Document and Pattern Recognition Lab (DPRL) from Rochester Institute of Technology. In particular, I am interested in the mathematical formula recognition work by Richard Zanibbi.
  • Infty: InftyProject is a voluntary R&D organization consisting of researchers from different universities and research institutes in Japan. In particular, I am interested in the mathematical formula recognition software and databases released by Infty.

Other systems

MIR - Mathematical Information Retrieval:
  • WikiMirs: A mathematical formula retrieval system for Wikipedia, supporting structural matching between mathematical formulae. This tool is developed by my colleague Xuan Hu.
  • LaTeXSearch: released by Springer. It affords the scientific community the ability to search for LaTeX code within scientific publications.
  • MathWebSearch: a mathematical formula retrieval system released by Jacobs University.
  • DLMF: The Digital Library of Mathematical Functions (DLMF) project is an electronic version of Abramowitz and Stegun's Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables.
  • EgoMath: EgoMath is a full text mathematical search engine intended for digital mathematical content. Now it enables mathematical searching in one of the world's largest digital libraries - Wikipedia.org.
  • MIaS: MIaS (Math Indexer and Searcher) is a math-aware, full-text based search engine.

Skills

  • Programming languages: C/C++, C#, Java, OCaml, Scala, Python, Shell
  • IDEs & Tools: IntelliJ IDEA, Eclipse, Visual Studio, Vim, Git, Weka, OpenCV
  • Data processing: Hadoop, MapReduce, Pig, HBase, Scope
  • Languages: Fluent English (IELTS-Band 7), native Mandarin, fluent Cantonese

If you would like to know more about me, you can

add me on Facebook,

circle me on Google+,

connect with me on LinkedIn,

view my GitHub profile,

follow me on Twitter or Weibo.