README
Helper to extract data from the DOM.
Supported environments:
- Browsers
- PhantomJS
- Cheerio
- jsdom
Example
This example uses Cheerio:
var cheerio = require('cheerio');
var eee = require('dom-eee');

var html = '<ul><li>item1</li><li>item2 <span>with span</span></li></ul>';
var $ = cheerio.load(html);

// Extract the text of every <li> that contains a <span>.
var result = eee($.root(),
  {
    items: {
      selector: 'li',
      type: 'collection',
      extract: { text: { selector: ':self' } },
      filter: { exists: 'span' }
    }
  },
  // Tell dom-eee which environment it runs in and pass the Cheerio instance.
  { env: 'cheerio', cheerio: $ });

console.log(result);
Prints:
{ items: [ { text: 'item2 with span' } ] }
Extraction expressions
The system works by evaluating a DSL expression written as a plain object. The syntax and semantics of the DSL are described below.
ObjectExpression:
  {
    "prop1": Expression,
    "prop2": Expression
  }
ObjectExpression returns an object with the given properties. Each property value is described by a further Expression.
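As an illustration, an ObjectExpression that pulls a page title and a list of links might look like the following. The property names (title, links, href, label) and the selectors are made up for this sketch; only the expression shapes and the ':self' selector come from this README.

```javascript
// Hypothetical ObjectExpression: property names and selectors are
// illustrative and not defined by dom-eee itself.
var pageExpression = {
  // SingleExpression: text contents of the element matched by 'h1'.
  title: { selector: 'h1' },
  // CollectionExpression: one object per matched <a>.
  links: {
    type: 'collection',
    selector: 'a',
    extract: {
      href: { selector: ':self', attribute: 'href' },
      label: { selector: ':self' }
    }
  }
};
```

Going by the descriptions in this section, the result would be expected to have the shape { title: ..., links: [ { href: ..., label: ... }, ... ] }.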
An Expression is either a CollectionExpression or a SingleExpression, and returns the value that expression describes.
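The distinction can be sketched as a small dispatch on the type property. kindOf is a hypothetical helper written for this README, not a dom-eee function. It relies on the rule stated under SingleExpression that a missing type means single; throwing on any other value is an assumption of the sketch.

```javascript
// Hypothetical helper, not part of dom-eee: classifies an Expression.
function kindOf(expression) {
  // A missing "type" defaults to "single".
  var type = expression.type === undefined ? 'single' : expression.type;
  // Assumption of this sketch: any other type is treated as an error.
  if (type !== 'single' && type !== 'collection') {
    throw new Error('Unknown expression type: ' + type);
  }
  return type;
}

console.log(kindOf({ selector: 'h1' })); // single
console.log(kindOf({ type: 'collection', selector: 'li', extract: {} })); // collection
```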
CollectionExpression:
  {
    "type": "collection",
    "selector": CSSSelector,
    "extract": ObjectExpression,
    "filter": FilterExpression
  }
CollectionExpression returns an array of items. Items are extracted by applying the extract expression to each element matched by the selector CSS rule. If the rule matches no elements, an empty array is returned.
Optionally, the filter property may be set. In that case the array of raw elements is first filtered through the FilterExpression.
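The order of operations (select, then filter, then extract) can be sketched without any DOM library. This is not dom-eee's implementation. It assumes nodes are plain { tag, text, children } objects, treats a "selector" as a bare tag name, and only handles text-extracting sub-expressions. findAll, textOf and evalCollection are hypothetical names.

```javascript
// Sketch of CollectionExpression semantics over plain objects.
// Assumptions: nodes are { tag, text, children } objects and a
// "selector" is just a tag name. Real environments use CSS selectors.

// Collect all descendants of `node` whose tag equals `tag`.
function findAll(node, tag) {
  var found = [];
  (node.children || []).forEach(function (child) {
    if (child.tag === tag) found.push(child);
    found = found.concat(findAll(child, tag));
  });
  return found;
}

// Concatenated text of a node and its descendants.
function textOf(node) {
  return (node.text || '') + (node.children || []).map(textOf).join('');
}

function evalCollection(root, expression) {
  // 1. Select the raw elements.
  var elements = findAll(root, expression.selector);
  // 2. Optional filter: keep elements that contain a match for "exists".
  if (expression.filter) {
    elements = elements.filter(function (el) {
      return findAll(el, expression.filter.exists).length > 0;
    });
  }
  // 3. Apply "extract" to each remaining element.
  return elements.map(function (el) {
    var item = {};
    Object.keys(expression.extract).forEach(function (key) {
      var sub = expression.extract[key];
      var target = sub.selector === ':self' ? el : findAll(el, sub.selector)[0];
      item[key] = target ? textOf(target) : null;
    });
    return item;
  });
}

var list = {
  tag: 'ul',
  children: [
    { tag: 'li', text: 'item1' },
    { tag: 'li', text: 'item2 ', children: [{ tag: 'span', text: 'with span' }] }
  ]
};

console.log(evalCollection(list, {
  selector: 'li',
  extract: { text: { selector: ':self' } },
  filter: { exists: 'span' }
}));
// [ { text: 'item2 with span' } ]
```

The mock tree mirrors the HTML from the Cheerio example at the top of this README, so the sketch produces the same items array.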
SingleExpression:
  {
    "type": "single",
    "selector": CSSSelector,
    "property": String,
    "attribute": String,
    "html": Boolean
  }
The property and attribute properties are optional. If one is present, the extracted value is the named property or attribute of the node matched by the selector. If the html property is set to true, the element's markup is extracted. If none of these is present, the text contents of the element are returned. If the selector matches nothing, null is returned.
The type property is optional. When it is not set, single is assumed as the default.
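A few SingleExpressions, one per extraction mode, written out as plain objects. The selectors and key names are hypothetical, and whether a given DOM property such as checked is available depends on the environment in use.

```javascript
// Illustrative SingleExpressions; selectors and key names are hypothetical.
var singleExamples = {
  // No property/attribute/html: text contents of the matched element.
  heading: { selector: 'h1' },
  // "attribute": value of the href attribute of the matched element.
  link: { selector: 'a.more', attribute: 'href' },
  // "property": value of the DOM property named "checked".
  agreed: { selector: 'input[type=checkbox]', property: 'checked' },
  // "html": true extracts the element's markup; "type" is spelled out here.
  body: { type: 'single', selector: 'article', html: true }
};
```

Each of these can be used as a property value inside an ObjectExpression or inside the extract part of a CollectionExpression.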
FilterExpression:
  {
    "exists": CSSSelector
  }
An element passes a FilterExpression if it has elements that match the CSS rule in the exists property.
Testing
Run npm test.
License
The MIT License.