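I'm trying to build a script that fetches the CSS file of any given WordPress site and extracts the theme information. The problem is that when I scrape the page, all the line breaks turn into spaces, and the order of the fields is not always the same. For example: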
/*
Theme Name: ColorWay
Theme URI: http://www.inkthemes.com/wp-themes/colorway-wp-theme/
Description: Colorway is Simple, Elegant, Responsive and beautiful Theme with Easy Customization Options built by InkThemes.com. The Customization Options includes using your own Logos, Backgrounds, Analytics and your own Custom Footer Texts and Analytics that can be tweaked using Theme Options Panel. Colorway Theme is Single Click Intall feature, Just press activate button and your website will get ready with all the dummy content. Just set the content from the Themes Options Panel. Colorway by InkThemes.com is suitable for any business or personal website. The Theme can work for various different niches. It includes special styles for Gallery pages, and has an optional fullwidth page template as well.
Author: InkThemes.com
Author URI: http://www.inkthemes.com
Version: 2.5.1
License: GNU General Public License
License URI: license.txt
Tags: black, blue, green, white, gray, custom-menu, dark, two-columns, fixed-width, custom-header, custom-background, threaded-comments, sticky-post, custom-colors, custom-header, custom-menu, light, theme-options, editor-style
*/
This is probably simple, but after scraping I get the following:
/* Theme Name: ColorWay Theme URI: www.inkthemes. com/wp-themes/colorway-wp-theme/ Description: Colorway is Simple, Elegant, Responsive and beautiful Theme with Easy Customization Options built by InkThemes.com. The Customization Options includes using your own Logos, Backgrounds, Analytics and your own Custom Footer Texts and Analytics that can be tweaked using Theme Options Panel. Colorway Theme is Single Click Intall feature, Just press activate button and your website will get ready with all the dummy content. Just set the content from the Themes Options Panel. Colorway by InkThemes .com is suitable for any business or personal website. The Theme can work for various different niches. It includes special styles for Gallery pages, and has an optional fullwidth page template as well. Author: InkThemes.com Author URI: www. inkthemes. com Version: 2.5.1 License: GNU General Public License License URI: license.txt Tags: black, blue, green, white, gray, custom-menu, dark, two-columns, fixed-width, custom-header, custom-background, threaded-comments, sticky-post, custom-colors, custom-header, custom-menu, light, theme-options, editor-style */
It is just one block of text. How would you do this?
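Edit:
Here is an example that works when the code is not all on one line:
Sorry, I thought you meant the URL I want to scrape.
Here is an example that works on the source when I simply copy it: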
$html = file_get_html('http://website-addons.net/wp-content/themes/powermag/style.css?ver=all');
preg_match("/Theme\sName:\s?(.+)/", $html, $themename);
preg_match("/Theme\sURI:\s?(.+?)\s/", $html, $uri);
preg_match("/Version:(\s?.+?)\s/", $html, $version);
preg_match("/Description:(.+)\s/", $html, $desc);
preg_match("/Author:(.+?)\s/", $html, $author);
echo $themename[1];
Doing it this way on the scraped page no longer works. I just get one big block of code.
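One likely cause, stated here as an assumption rather than a confirmed diagnosis: `file_get_html` comes from the Simple HTML DOM library, which parses the response as HTML and, with its default settings, replaces line breaks with spaces. Reading the stylesheet as raw text (for instance with PHP's built-in `file_get_contents`) should keep the newlines, so line-anchored patterns can work again. The sketch below uses a hypothetical helper name, `parse_theme_header`, and an inline sample instead of a network request.

```php
<?php
// Hypothetical helper: extract "Key: value" header fields from raw stylesheet text.
function parse_theme_header(string $css): array
{
    $keys = ['Theme Name', 'Theme URI', 'Description', 'Author', 'Author URI', 'Version'];
    $info = [];
    foreach ($keys as $key) {
        // The m flag lets ^ and $ match at line boundaries; \h is horizontal whitespace.
        $re = '/^\h*' . preg_quote($key, '/') . ':(.*)$/m';
        if (preg_match($re, $css, $m) === 1) {
            $info[$key] = trim($m[1]);
        }
    }
    return $info;
}

// In real use the text would come from file_get_contents($url);
// an abbreviated inline sample keeps this sketch runnable offline.
$css = "/*\nTheme Name: ColorWay\n"
     . "Theme URI: http://www.inkthemes.com/wp-themes/colorway-wp-theme/\n"
     . "Author: InkThemes.com\n"
     . "Author URI: http://www.inkthemes.com\n"
     . "Version: 2.5.1\n*/";

print_r(parse_theme_header($css));
```

Because each pattern requires the colon directly after the key, `Author` does not accidentally match the `Author URI` line. This only helps when the newlines survive the download; it does nothing for text that has already been collapsed onto one line.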
Answered 2015-01-20 04:45:38
Sure, I think this can work for both single-line and multi-line input.
# '/(?s)^(?=.*?\bTheme[ ]+Name:[ ]*(?<theme_name>(?&info)))?(?=.*?\bTheme[ ]+URI:[ ]*(?<theme_uri>(?&info)))?(?=.*?\bDescription:[ ]*(?<desc>(?&info)))?(?=.*?\bAuthor:[ ]*(?<author>(?&info)))?(?=.*?\bAuthor[ ]+URI:[ ]*(?<author_uri>(?&info)))?(?=.*?\bVersion:[ ]*(?<version>(?&info)))?(?=.*?\bLicense:[ ]*(?<license>(?&info)))?(?=.*?\bLicense[ ]+URI:[ ]*(?<license_uri>(?&info)))?(?=.*?\bTags:[ ]*(?<tags>(?&info)))?(?(1)|(?(2)|(?(3)|(?(4)|(?(5)|(?(6)|(?(7)|(?(8)|(?(9)|(?!))))))))))(?(DEFINE)(?<info>(?-s:(?![ ]*\b(?:Theme[ ]+Name:|Theme[ ]+URI:|Description:|Author:|Author[ ]+URI:|Version:|License:|License[ ]+URI:|Tags:)).)*))/'
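As a minimal sketch of how the one-line pattern could be applied from PHP: the pattern is copied unchanged into a single-quoted string, and the named groups are read from the `preg_match` result. The sample text is an abbreviated version of the single-line header from the question, and the cleanup of the trailing `*/` is an addition of this sketch, not part of the pattern.

```php
<?php
// The one-line pattern from above, unchanged.
$pattern = '/(?s)^(?=.*?\bTheme[ ]+Name:[ ]*(?<theme_name>(?&info)))?(?=.*?\bTheme[ ]+URI:[ ]*(?<theme_uri>(?&info)))?(?=.*?\bDescription:[ ]*(?<desc>(?&info)))?(?=.*?\bAuthor:[ ]*(?<author>(?&info)))?(?=.*?\bAuthor[ ]+URI:[ ]*(?<author_uri>(?&info)))?(?=.*?\bVersion:[ ]*(?<version>(?&info)))?(?=.*?\bLicense:[ ]*(?<license>(?&info)))?(?=.*?\bLicense[ ]+URI:[ ]*(?<license_uri>(?&info)))?(?=.*?\bTags:[ ]*(?<tags>(?&info)))?(?(1)|(?(2)|(?(3)|(?(4)|(?(5)|(?(6)|(?(7)|(?(8)|(?(9)|(?!))))))))))(?(DEFINE)(?<info>(?-s:(?![ ]*\b(?:Theme[ ]+Name:|Theme[ ]+URI:|Description:|Author:|Author[ ]+URI:|Version:|License:|License[ ]+URI:|Tags:)).)*))/';

// Abbreviated single-line sample (shortened for illustration).
$css = '/* Theme Name: ColorWay Theme URI: http://www.inkthemes.com/wp-themes/colorway-wp-theme/ '
     . 'Description: Colorway is Simple, Elegant, Responsive and beautiful Theme with Easy '
     . 'Customization Options built by InkThemes.com. Author: InkThemes.com '
     . 'Author URI: http://www.inkthemes.com Version: 2.5.1 '
     . 'License: GNU General Public License License URI: license.txt '
     . 'Tags: black, blue, green */';

$fields = ['theme_name', 'theme_uri', 'desc', 'author', 'author_uri',
           'version', 'license', 'license_uri', 'tags'];

$info = [];
if (preg_match($pattern, $css, $m) === 1) {
    foreach ($fields as $key) {
        if (isset($m[$key]) && $m[$key] !== '') {
            // Strip a trailing comment terminator, which the last field
            // picks up when everything is on one line.
            $info[$key] = trim(preg_replace('~\s*\*/\s*$~', '', $m[$key]));
        }
    }
}

print_r($info);
```

The overall match is empty because the pattern consists only of lookaheads; all the data lives in the named groups. The same pattern, expanded with comments: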
(?s)                                   # Dot all modifier
^                                      # BOS
                                       # Series of lookaheads, optional (i.e. independent order, change if need be)
(?=
    .*? \b Theme [ ]+ Name: [ ]*
    (?<theme_name> (?&info) )          # (1), Theme, optional
)?
(?=
    .*? \b Theme [ ]+ URI: [ ]*
    (?<theme_uri> (?&info) )           # (2), Theme URI, optional
)?
(?=
    .*? \b Description: [ ]*
    (?<desc> (?&info) )                # (3), Description
)?
(?=
    .*? \b Author: [ ]*
    (?<author> (?&info) )              # (4), Author
)?
(?=
    .*? \b Author [ ]+ URI: [ ]*
    (?<author_uri> (?&info) )          # (5), Author URI
)?
(?=
    .*? \b Version: [ ]*
    (?<version> (?&info) )             # (6), Version
)?
(?=
    .*? \b License: [ ]*
    (?<license> (?&info) )             # (7), License
)?
(?=
    .*? \b License [ ]+ URI: [ ]*
    (?<license_uri> (?&info) )         # (8), License URI
)?
(?=
    .*? \b Tags: [ ]*
    (?<tags> (?&info) )                # (9), Tags
)?
(?(1)                                  # Conditional, Fail if nothing matched
  | (?(2)
      | (?(3)
          | (?(4)
              | (?(5)
                  | (?(6)
                      | (?(7)
                          | (?(8)
                              | (?(9)
                                  | (?!)
                                )
                            )
                        )
                    )
                )
            )
        )
    )
)
(?(DEFINE)                             # Subroutines
    (?<info>                           # (10 start), Info
        (?-s:                          # Cluster, data is on same line (remove '-s' if need be)
            (?!                        # Blacklist - Not any other categories, add more here
                [ ]*                   # trim spaces before next block
                \b
                (?:
                    Theme [ ]+ Name:
                  | Theme [ ]+ URI:
                  | Description:
                  | Author:
                  | Author [ ]+ URI:
                  | Version:
                  | License:
                  | License [ ]+ URI:
                  | Tags:
                )
            )
            .                          # Grab a data character
        )*                             # End Cluster, do 0 to many times
    )                                  # (10 end)
)
Output for the multi-line sample:
** Grp 0 - ( pos 0 , len 0 ) EMPTY
** Grp 1 - ( pos 16 , len 8 )
ColorWay
** Grp 2 - ( pos 37 , len 53 )
http://www.inkthemes.com/wp-themes/colorway-wp-theme/
** Grp 3 - ( pos 105 , len 699 )
Colorway is Simple, Elegant, Responsive and beautiful Theme with Easy Customization Options built by InkThemes.com. The Customization Options includes using your own Logos, Backgrounds, Analytics and your own Custom Footer Texts and Analytics that can be tweaked using Theme Options Panel. Colorway Theme is Single Click Intall feature, Just press activate button and your website will get ready with all the dummy content. Just set the content from the Themes Options Panel. Colorway by InkThemes.com is suitable for any business or personal website. The Theme can work for various different niches. It includes special styles for Gallery pages, and has an optional fullwidth page template as well.
** Grp 4 - ( pos 814 , len 13 )
InkThemes.com
** Grp 5 - ( pos 841 , len 24 )
http://www.inkthemes.com
** Grp 6 - ( pos 876 , len 5 )
2.5.1
** Grp 7 - ( pos 892 , len 26 )
GNU General Public License
** Grp 8 - ( pos 933 , len 11 )
license.txt
** Grp 9 - ( pos 952 , len 221 )
black, blue, green, white, gray, custom-menu, dark, two-columns, fixed-width, custom-header, custom-background, threaded-comments, sticky-post, custom-colors, custom-header, custom-menu, light, theme-options, editor-style
** Grp 10 - NULL
Output for the single-line sample:
** Grp 0 - ( pos 0 , len 0 ) EMPTY
** Grp 1 - ( pos 15 , len 8 )
ColorWay
** Grp 2 - ( pos 35 , len 47 )
www.inkthemes. com/wp-themes/colorway-wp-theme/
** Grp 3 - ( pos 96 , len 700 )
Colorway is Simple, Elegant, Responsive and beautiful Theme with Easy Customization Options built by InkThemes.com. The Customization Options includes using your own Logos, Backgrounds, Analytics and your own Custom Footer Texts and Analytics that can be tweaked using Theme Options Panel. Colorway Theme is Single Click Intall feature, Just press activate button and your website will get ready with all the dummy content. Just set the content from the Themes Options Panel. Colorway by InkThemes .com is suitable for any business or personal website. The Theme can work for various different niches. It includes special styles for Gallery pages, and has an optional fullwidth page template as well.
** Grp 4 - ( pos 805 , len 13 )
InkThemes.com
** Grp 5 - ( pos 831 , len 19 )
www. inkthemes. com
** Grp 6 - ( pos 860 , len 5 )
2.5.1
** Grp 7 - ( pos 875 , len 26 )
GNU General Public License
** Grp 8 - ( pos 915 , len 11 )
license.txt
** Grp 9 - ( pos 933 , len 224 )
black, blue, green, white, gray, custom-menu, dark, two-columns, fixed-width, custom-header, custom-background, threaded-comments, sticky-post, custom-colors, custom-header, custom-menu, light, theme-options, editor-style */
** Grp 10 - NULL
https://stackoverflow.com/questions/28027012