33 Commits

Author SHA1 Message Date
Andrew Dolgov 304d3a0b88 tag-related fixes
1. move tag sanitization to feedparser common item class
2. enforce length limit on tags when parsing
3. support multiple tags passed via one dc:subject and other such elements, parse them as a comma-separated list
4. sort resulting tag list to prevent different order between feed updates
5. remove some duplicate code related to tag validation
6. allow + symbol in tags
2019-11-20 18:56:34 +03:00
Andrew Dolgov 55ef85adc0 parser: clean() attribute values by default (except content) 2018-12-26 10:16:11 +03:00
Andrew Dolgov 54727f9534 parser: move media:element handling to feeditem_common; use media:content @media attribute to generate placeholder content-type if not specified 2018-08-21 07:01:26 +03:00
Tobias Kappé 22a866edb5 Store language of entries as indicated by the feed. 2018-08-12 15:27:26 +01:00
Andrew Dolgov ea79a0e033 remove some redundant php closing tags 2017-04-26 20:24:18 +03:00
Andrew Dolgov 7d1e15c396 parser: properly support tag subtrees instead of text content for article content 2016-01-23 01:48:32 +03:00
Andrew Dolgov d2bb392bae Revert "parser: use node->c14n() instead of expecting html in nodeValue"
This reverts commit 1383514ad9.
2016-01-23 01:24:13 +03:00
Andrew Dolgov 1383514ad9 parser: use node->c14n() instead of expecting html in nodeValue 2016-01-23 01:04:24 +03:00
Andrew Dolgov 206326c219 feedparser: xpath doesn't properly query for title element if there's a default namespace so let's add a separate ugly hack for rdf:RDF feeds, thanks for that xml dipshits 2015-01-19 21:40:20 +03:00
zaikos 2b4853f515 Reverts most of be60340. Implements a simpler solution using XPath to get the proper title tag from a feed item. 2015-01-14 16:13:39 -05:00
zaikos be60340c29 Made FeedItem_RSS::get_title() more aggressive in finding an article title. 2015-01-14 13:28:58 -05:00
Felix Eckhofer 523bd90baf Store size of enclosure to database 2014-07-15 16:23:46 +02:00
Andrew Dolgov 31bd6f7643 parser: trim some feed-extracted data: link titles and links 2014-03-04 16:38:04 +04:00
Andrew Dolgov 2ab7ccb695 parser: fix failing on empty media:group tags 2014-01-12 08:53:30 +04:00
Andrew Dolgov f6c61b2d55 rss: choose between description and content:encoded based on which one is longer because publishers are idiots and can't use tags properly 2013-12-19 13:19:30 +04:00
Andrew Dolgov e23aedd402 parser: add basic support for media:thumbnail 2013-12-15 12:35:30 +04:00
Jeffrey Tolar ed449a9aaa Follow the spec for <media:group>s
Each <media:group> section specifies multiple representations of the
same content.
2013-11-17 17:58:43 -06:00
Andrew Dolgov 5c54e68388 support media:description for media: enclosures 2013-08-05 12:26:09 +04:00
Andrew Dolgov 6bf61bdc63 simplify media:content xpath 2013-08-05 11:50:15 +04:00
Andrew Dolgov 4289b68f0d parser: support media:content elements within media:group 2013-08-05 10:33:13 +04:00
Andrew Dolgov ce5d234d63 support dc:date elements in rss and atom feeds 2013-06-01 09:49:56 +04:00
Andrew Dolgov df2655e015 better support for atom:link elements in rss feeds, support rel=standout (fuck you google and your nonstandard shit) 2013-05-26 10:21:54 +04:00
Andrew Dolgov 042003d55e parser/rss: try to get link from guid isPermaLink=true 2013-05-20 15:01:18 +04:00
Andrew Dolgov 2f6b75d574 fix atom:link not supported in rss feeds (fucking fuck) (2) 2013-05-17 22:57:18 +04:00
Andrew Dolgov f7d64d03fc fix atom:link not supported in rss feeds (fucking fuck) 2013-05-17 22:50:38 +04:00
Andrew Dolgov 99b8256794 feedparser: make content:encoded take precedence over description 2013-05-02 10:30:41 +04:00
Andrew Dolgov 8a95d630a9 fix rss content:encoded not used 2013-05-01 22:05:59 +04:00
Andrew Dolgov b4d1690097 move common methods to feeditem_common 2013-05-01 21:06:48 +04:00
Andrew Dolgov f11015058d support dc:creator 2013-05-01 21:01:30 +04:00
Andrew Dolgov d4992d6b48 add support for dc:subject and slash:comments 2013-05-01 20:55:08 +04:00
Andrew Dolgov 4c00e15b5d pass xpath object to feeditem, support media-rss objects 2013-05-01 19:40:43 +04:00
Andrew Dolgov b09a4cdccc feeditem_rss: use guid element 2013-05-01 19:12:32 +04:00
Andrew Dolgov 04d2f9c831 add basic rss support 2013-05-01 17:38:16 +04:00