README.md
30 additions & 6 deletions
@@ -26,9 +26,9 @@ $a = $dom->find('a')[0];
 echo $a->text; // "click here"
 ```
 
-The above will output "click here". Simple, no? There are many ways to get the same result from the dom, such as $dom->getElementsByTag('a')[0] or $dom->find('a', 0), all of which can be found in the tests or in the code itself.
+The above will output "click here". Simple, no? There are many ways to get the same result from the dom, such as `$dom->getElementsByTag('a')[0]` or `$dom->find('a', 0)`, all of which can be found in the tests or in the code itself.
 
-Example With Files
+Loading Files
 ------------------
 
 You may also seamlessly load a file into the dom instead of a string, which is much more convenient and is how I expect most developers will be loading the html. The following example is taken from our test and uses the "big.html" file found there.
@@ -57,9 +57,9 @@ foreach ($contents as $content)
 
 This example loads the html from big.html, a real page found online, and gets all the content-border classes to process. It also shows a few things you can do with a node, but it is not an exhaustive list of the methods a node has available.
 
-Alternatively, you can always use the load() method to load the file. It will attempt to find the file using file_exists and, if successful, will call loadFromFile() for you. The same applies to a URL and the loadFromUrl() method.
+Alternatively, you can always use the `load()` method to load the file. It will attempt to find the file using `file_exists` and, if successful, will call `loadFromFile()` for you. The same applies to a URL and the `loadFromUrl()` method.
 
-Example With Url
+Loading Url
 ----------------
 
 Loading a url is very similar to the way you would load the html from a file.
 $html = $dom->outerHtml; // same result as the first example
 ```
 
-What makes the loadFromUrl method noteworthy is the PHPHtmlParser\CurlInterface parameter, an optional second parameter. By default, we use the PHPHtmlParser\Curl class to get the contents of the url. You can, however, inject your own implementation of CurlInterface, and we will attempt to load the url using whatever tool or settings you want.
+What makes the loadFromUrl method noteworthy is the `PHPHtmlParser\CurlInterface` parameter, an optional second parameter. By default, we use the `PHPHtmlParser\Curl` class to get the contents of the url. You can, however, inject your own implementation of CurlInterface, and we will attempt to load the url using whatever tool or settings you want.
 
 ```php
 use PHPHtmlParser\Dom;
@@ -87,7 +87,31 @@ $dom->loadFromUrl('http://google.com', new Connector);
 $html = $dom->outerHtml;
 ```
 
-As long as the Connector object implements the PHPHtmlParser\CurlInterface interface properly, it will be used to get the content of the url instead of the default PHPHtmlParser\Curl class.
+As long as the Connector object implements the `PHPHtmlParser\CurlInterface` interface properly, it will be used to get the content of the url instead of the default `PHPHtmlParser\Curl` class.
+
+Options
+-------
+
+You can also set parsing options that will affect the behavior of the parsing engine. You can set a global option array using the `setOptions` method of the `Dom` object, or an instance-specific option by passing it to the `load` method as an extra (optional) parameter.
+
+```php
+use PHPHtmlParser\Dom;
+
+$dom = new Dom;
+$dom->setOptions([
+    'strict' => true, // Set a global option to enable strict html parsing.
+]);
+
+$dom->load('http://google.com', [
+    'whitespaceTextNode' => false, // Only applies to this load.
+]);
+
+$dom->load('http://gmail.com'); // will not have whitespaceTextNode set to false.
+```
+
+At the moment we support 2 options, strict and whitespaceTextNode. The strict option, false by default, makes the parser throw a `StrictException` if it finds that the html is not strictly compliant (all tags must have a closing tag, no attribute without a value, etc.).
+
+The whitespaceTextNode option, true by default, tells the parser to keep text nodes even if the content of the node is empty (only whitespace). Setting it to false will ignore all whitespace-only text nodes found in the document.
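
The Options text in this diff describes defaults, a global option array, and per-load overrides. The sketch below illustrates one way that precedence could work. `resolveOptions` is a hypothetical helper written for illustration only; it is not part of PHPHtmlParser, and the assumption that per-load options are merged on top of the global ones (rather than replacing them) is not confirmed by the README.

```php
<?php
// Hypothetical helper, NOT part of PHPHtmlParser: computes the options
// that would apply to a single load() call under a simple merge model.
function resolveOptions(array $global, array $perLoad): array
{
    // Defaults as stated in the README text: strict is false and
    // whitespaceTextNode is true unless overridden.
    $defaults = [
        'strict'             => false,
        'whitespaceTextNode' => true,
    ];

    // With string keys, array_merge lets later arrays win, so per-load
    // options override global ones, which override the defaults.
    return array_merge($defaults, $global, $perLoad);
}

// Mirrors the README example: strict is set globally, and
// whitespaceTextNode is disabled for the first load only.
$global = ['strict' => true];

$first  = resolveOptions($global, ['whitespaceTextNode' => false]);
$second = resolveOptions($global, []);
```

Under this model the second call falls back to the default for `whitespaceTextNode`, which matches the README comment that the second load "will not have whitespaceTextNode set to false".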