Finding Nuggets in Patent Portfolios: Core Patent Mining and Its Applications

Po Hu; Minlie Huang; Xiaoyan Zhu

doi:10.1109/TST.2013.6574672


Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Abstract

Patents are critically important for a company to protect its core business concepts and proprietary technologies. Effective mining of massive patent databases gives business enterprises valuable insights for developing strategies in research and development, intellectual property management, and product marketing, and it helps patent offices improve efficiency and optimize their examination processes. This paper addresses the patent mining problem of automatically discovering core patents, i.e., patents that are both novel and influential in a domain. The value of core patent mining is further illustrated by using companies' core patents to reveal potential competitive relationships among them. Patents use a distinctive vocabulary that traditional word-based statistical methods do not account for; the proposed topic-based temporal mining approach handles this by quantifying a patent's novelty and influence through variations in topic activeness. Tests on real-world patent portfolios show that the approach is more effective than state-of-the-art methods.

Keywords

text mining; core patent; patent novelty; patent influence; company competitor

Tsinghua Science and Technology

Volume 18, Issue 4, August 2013, Pages 339-352

Cite this article:

Hu P, Huang M, Zhu X. Finding Nuggets in Patent Portfolios: Core Patent Mining and Its Applications. Tsinghua Science and Technology, 2013, 18(4): 339-352. https://doi.org/10.1109/TST.2013.6574672