In natural language processing, information extraction (IE) is a type of information retrieval whose goal is to automatically extract structured information, i.e. categorized and contextually and semantically well-defined data from a certain domain, from unstructured machine-readable documents. An example is the extraction of instances of corporate mergers, more formally MergerBetween(company1, company2, date), from an online news sentence such as: "Yesterday, New York-based Foo Inc. announced their acquisition of Bar Corp." A broad goal of IE is to allow computation on previously unstructured data. A more specific goal is to allow logical reasoning to draw inferences from the logical content of the input data.
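As a rough illustration of the merger example, the sketch below fills a MergerBetween record from the sample sentence with a single hand-written pattern. It rests on several assumptions that are not part of the text above: company names are taken to be capitalized words ending in a legal suffix such as "Inc." or "Corp.", the relative date word is resolved against a publication date supplied by the caller, and the names `extract_merger`, `COMPANY` and `PATTERN` are made up for this example. Real systems typically use far more robust methods.

```python
import datetime as dt
import re
from typing import NamedTuple, Optional


class MergerBetween(NamedTuple):
    company1: str
    company2: str
    date: dt.date


# Assumption: a company name is one or more capitalized words
# followed by a legal suffix and a period.
COMPANY = r"(?:[A-Z][\w&]*\s)+(?:Inc|Corp|Ltd|LLC)\."

# Assumption: the sentence opens with a relative date word and uses the
# phrasing "<buyer> announced their/its acquisition of <target>".
PATTERN = re.compile(
    rf"(?P<when>Yesterday|Today),.*?(?P<buyer>{COMPANY}) announced "
    rf"(?:their|its) acquisition of (?P<target>{COMPANY})"
)


def extract_merger(sentence: str, published: dt.date) -> Optional[MergerBetween]:
    """Return a MergerBetween record, or None if the pattern does not match."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    # Resolve the relative date word against the publication date.
    days_ago = 1 if match.group("when") == "Yesterday" else 0
    return MergerBetween(
        match.group("buyer"),
        match.group("target"),
        published - dt.timedelta(days=days_ago),
    )


sentence = (
    "Yesterday, New York-based Foo Inc. announced their acquisition of Bar Corp."
)
print(extract_merger(sentence, published=dt.date(2010, 2, 14)))
```

Once the sentence has been reduced to a typed record like this, ordinary computation (filtering by date, joining on company name) becomes possible, which is the broad goal described above.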

The significance of IE stems from the growing amount of information available in unstructured form (i.e. without metadata), for instance on the Internet. This knowledge can be made more accessible by transforming it into relational form or by marking it up with XML tags. An intelligent agent monitoring a news data feed requires IE to transform unstructured data into something that can be reasoned with.
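To make the XML option concrete, here is a minimal sketch that wraps already-identified character spans of a sentence in XML elements using Python's standard `xml.etree.ElementTree` module. The element names `sentence` and `company`, the function name `mark_up`, and the assumption that spans are given as non-overlapping (start, end, tag) triples are all illustrative choices, not something prescribed by the text.

```python
import xml.etree.ElementTree as ET


def mark_up(text, spans):
    """Wrap each (start, end, tag) character span of text in an XML element.

    Assumes the spans do not overlap.
    """
    root = ET.Element("sentence")
    root.text = ""
    cursor = 0
    last = None  # most recently created child element
    for start, end, tag in sorted(spans):
        plain = text[cursor:start]  # untagged text before this span
        if last is None:
            root.text = plain
        else:
            last.tail = plain
        last = ET.SubElement(root, tag)
        last.text = text[start:end]
        cursor = end
    # Attach whatever untagged text remains after the final span.
    if last is None:
        root.text = text[cursor:]
    else:
        last.tail = text[cursor:]
    return ET.tostring(root, encoding="unicode")


text = "Foo Inc. acquired Bar Corp."
spans = [
    (0, len("Foo Inc."), "company"),
    (text.index("Bar"), len(text), "company"),
]
print(mark_up(text, spans))
```

The same spans could just as well be written as rows of a relational table; XML markup has the advantage of keeping the extracted items in their original textual context.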

A typical application of IE is to scan a set of documents written in a natural language and populate a database with the extracted information. Current approaches to IE use natural language processing techniques that focus on very restricted domains. For example, the Message Understanding Conference (MUC) was a competition-based conference series that focused on the following domains:

  • MUC-1 (1987), MUC-2 (1989): Naval operations messages.
  • MUC-3 (1991), MUC-4 (1992): Terrorism in Latin American countries.
  • MUC-5 (1993): Joint ventures and microelectronics domain.
  • MUC-6 (1995): News articles on management changes.
  • MUC-7 (1998): Satellite launch reports.

Natural language texts may first need some form of text simplification to produce more easily machine-readable text from which sentences can be extracted.

Typical subtasks of IE are:

  • Content noise removal: removal of noise content such as tag clouds, navigational menus, related-content links, and context-related advertisements.
  • Named Entity Recognition: recognition of entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions.
  • Coreference resolution: detection of coreference and anaphoric links between text entities. In IE tasks, this is typically restricted to finding links between previously extracted named entities. For example, "International Business Machines" and "IBM" refer to the same real-world entity.
  • Terminology extraction: finding the relevant terms for a given corpus.
  • Relationship Extraction: identification of relations between entities, such as:
    • PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.")
    • PERSON located in LOCATION (extracted from the sentence "Bill is in France.")
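The relationship patterns and the acronym-style coreference link from the list above can be sketched in a few lines. This is a toy illustration under stated assumptions: a tiny hand-made dictionary (`ENTITY_TYPES`) stands in for a real named entity recognizer, the two surface patterns cover only the example sentences, and the names `RULES`, `extract_relations` and `is_acronym_of` are hypothetical.

```python
import re

# Assumption: a hand-made gazetteer stands in for named entity recognition.
ENTITY_TYPES = {"Bill": "PERSON", "IBM": "ORGANIZATION", "France": "LOCATION"}

# Each rule: (surface pattern, relation name, subject type, object type).
RULES = [
    (re.compile(r"(\w+) works for (\w+)"), "works_for", "PERSON", "ORGANIZATION"),
    (re.compile(r"(\w+) is in (\w+)"), "located_in", "PERSON", "LOCATION"),
]


def extract_relations(sentence):
    """Return (relation, subject, object) triples whose entity types fit a rule."""
    relations = []
    for pattern, name, subj_type, obj_type in RULES:
        for subj, obj in pattern.findall(sentence):
            # Keep the match only if both arguments have the expected types.
            if (ENTITY_TYPES.get(subj) == subj_type
                    and ENTITY_TYPES.get(obj) == obj_type):
                relations.append((name, subj, obj))
    return relations


def is_acronym_of(short, long_name):
    """Naive coreference cue: do the initials of long_name spell short?"""
    initials = "".join(word[0] for word in long_name.split())
    return short.upper() == initials.upper()


print(extract_relations("Bill works for IBM."))
print(extract_relations("Bill is in France."))
print(is_acronym_of("IBM", "International Business Machines"))
```

The type check is what separates relation extraction from plain pattern matching: "Bill is in IBM." matches the second surface pattern but yields no triple, because IBM is not typed as a LOCATION.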

From Wikipedia under the GNU Free Documentation License
Sun Feb 14 08:29:50 2010

Can Stun Guns Be Used Effectively for Torture, Information Extraction, and Execution, Looking Like an Accident?
Q. I think that these new so-called non-lethal weapons have a great potential for abuse by police. Police can overuse these weapons on civilians for sinister purposes like racial hatred or sadistic cruelty. I think police gain great pleasure from sadistic acts of cruelty inflicted on people. I have read many articles about police laughing after seriously hurting people. Are police agencies a type of sick S & M, like de Sade? I think that the police get some type of adrenaline rush: a powerful endorphin high from the thrills of high-speed chases and torturing and killing people and destroying people's lives. Endorphins are a powerful heroin-like high. I bet if you did a scientific experiment and monitored police endorphin levels you'd be… [cont.]
Asked by God Save America - Sat Sep 23 14:58:48 2006 - - 9 Answers - 0 Comments

A. Don't be ridiculous!
Answered by leckie1UK - Wed Sep 27 18:26:26 2006

information extraction and retrieval on text mining and image mining?
Q. information extraction and retrieval on text mining and image mining?
Asked by arjun - Tue Aug 8 06:59:13 2006 - - 1 Answers - 0 Comments

A. U just follow the link good luck
Answered by dewman_byju - Tue Aug 8 07:23:30 2006

Where can I find information about extraction of materials, such as herb extractions?
Q. I need to know the chemical materials that will help me to extract some kinds of herbs such as sage, thyme,
Asked by may_daouk - Sun Apr 16 13:26:30 2006 - - 1 Answers - 0 Comments

A. Not All Oils Are Created Equal: Some plants, like rose and jasmine, contain very little essential oil. Their important aromatic properties are extracted using a chemical solvent. The end product, known as an absolute, contains essential oil along with other plant constituents. Though not a true essential oil, absolutes are commonly used for fragrancing cosmetic products like fine perfumes. There are also significant differences between synthetic fragrance oils (made possible by recent advances in chemistry) and pure essential oils. Synthetic fragrance oils are produced by blending aromatic chemicals primarily derived from coal tar. These oils may duplicate the smell of the pure botanical, but the complex chemical components of each… [cont.]
Answered by Mrs. L - Thu Apr 20 15:17:56 2006

From Yahoo Answer Search: "information extraction"
Tue Dec 22 23:41:14 2009

Pocono Bulletin Board: Tuesday, March 2 - Pocono Record
news.google.com
The League of Women Voters of Monroe County will host a forum to discuss the Marcellus shale natural gas extraction from 10 am to noon Saturday, ...

Parsing fact from fiction with the Bloom Energy box - CNET
news.google.com
Currently it is cheap because of the new hydraulic fracturing technology produced by Halliburton that's allowing the extraction of natural gas from shale gas ...

NaturalNano Announces Letter of Intent to Acquire Majority Interest in ... - MarketWatch (press release)
news.google.com
... know-how for extraction and separation processes, compositions, and derivatives of Halloysite. Visit www.naturalnano.com for more information. ...

From Google News Search: "information extraction"
Fri Mar 5 04:28:24 2010

mmdss07 yangarber cir Page 022 480 jpg - carbon.videolectures.net (338px x 480px, 6.60kB)
simubistatic jpg - ensieta.fr (400px x 431px, 31.70kB)
radarsillage jpg - ensieta.fr (639px x 851px, 51.30kB)

From Yahoo Image Search: "information extraction"
Fri Mar 5 04:28:11 2010

Predicting Structured Data (Neural Information Processing ...
page2book.com
admin - Sun, 29 Nov 2009 23:43:50 GMT

The contributors discuss applications as diverse as machine translation, document markup, computational biology, and information extraction, among others, providing a timely overview of an exciting field. ...

Changeset 13803 Astrometry.net
trac.astrometry.net
hogg - Sat, 05 Dec 2009 20:48:12 GMT

... is worth saying a few words about optimal extraction. What we say here is largely in response to the Bolton & Schlegel contribution, which is, to our knowledge, the most sophisticated treatment of the problem to date. ... cannot carry forward covariance information, or the investigator is uninterested in marginalization over uncertainties. Neither of these conditions is met for the general spectroscopic user, and in ...

Purdue e-Pubs - Peter A. Bracken and John T. Dalton: The ...
docs.lib.purdue.edu
Peter A. Bracken - Sat, 17 Oct 2009 01:52:52 GMT

The AOIPS is an interactive, minicomputer-based processing and display system that is used primarily for image data analysis and information extraction operations within the Applications Directorate at NASA's Goddard Space Flight ...

From Google Blog Search: "information extraction"
Mon Dec 7 19:01:42 2009