
Sitemaps: Fix video sitemap generation #43333

Merged: 4 commits, May 5, 2025

Conversation

@tbradsha commented May 1, 2025

Video sitemaps were only generating partial data due to a mismatch between what the passed array looked like and what the array was expected to look like.

It also resulted in lots of these PHP warnings:

PHP Warning:  Array to string conversion in /modules/sitemaps/sitemap-buffer-video-xmlwriter.php on line 69
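
For context, a minimal sketch of how this kind of warning arises. The statement on line 69 is not shown here, so this assumes the nested `video:video` array was coerced to a string at some point; the variable names are illustrative only.

```php
<?php
// Illustration only: casting an array to a string yields the literal
// "Array" and raises the "Array to string conversion" diagnostic.
$messages = array();
set_error_handler(
    function ( $errno, $errstr ) use ( &$messages ) {
        $messages[] = $errstr;
        return true; // Skip PHP's default handler for this demo.
    }
);

// Same shape as the nested value passed under the 'video:video' key.
$video = array( 'video:title' => 'some_video' );
$text  = (string) $video;

restore_error_handler();
```

This matches the `<video:video>Array</video:video>` element seen in the XMLWriter output generated before this PR.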

Reported in p1745995831728999/1745855658.393169-slack-C01U2KGS2PQ; related to #42767.

Proposed changes:

The needed content was not in an array of items under the `videos` key; it was directly in a single array under the `video:video` key. I adjusted the logic to match this and also added an early return to the function.
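
A rough sketch of the adjusted logic, for readers who don't want to open the diff. `sketch_append_video_item()` is a hypothetical standalone function written for this description, not the actual `append_item()` method; it only assumes the array shape shown further down and uses the stock `XMLWriter` class.

```php
<?php
/**
 * Hypothetical stand-in for the corrected append_item() logic.
 * Not the actual Jetpack implementation.
 */
function sketch_append_video_item( XMLWriter $writer, array $array ) {
    // Return early if missing fundamental data.
    if ( empty( $array['url']['video:video'] ) || ! is_array( $array['url']['video:video'] ) ) {
        return;
    }

    $writer->startElement( 'url' );

    foreach ( $array['url'] as $tag => $value ) {
        if ( 'video:video' === $tag ) {
            // The video fields live directly in this one associative array.
            $writer->startElement( 'video:video' );
            foreach ( $value as $video_tag => $video_value ) {
                $writer->writeElement( $video_tag, (string) $video_value );
            }
            $writer->endElement(); // Closes video:video.
        } else {
            $writer->writeElement( $tag, (string) $value );
        }
    }

    $writer->endElement(); // Closes url.
}

$item = array(
    'url' => array(
        'loc'         => 'https://example.com/?attachment_id=69',
        'lastmod'     => '2025-05-01T15:41:27Z',
        'video:video' => array(
            'video:title'       => 'some_video',
            'video:content_loc' => 'https://example.com/wp-content/uploads/2025/05/example_video.mp4',
        ),
    ),
);

$writer = new XMLWriter();
$writer->openMemory();
sketch_append_video_item( $writer, $item );
$xml = $writer->outputMemory();
```

With this shape, `$xml` contains nested elements such as `<video:title>some_video</video:title>` instead of a stringified array.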

I'll note that the schema does allow for multiple <video:video> arrays (see example here), but I couldn't reproduce any instance where we actually pass an array of videos. There is a filter, jetpack_sitemap_video_sitemap_item, that could modify this content, but I don't see anywhere that we use it to do so. I tested with a standalone site as well as a WordPress.com Atomic site, and both gave me a single video.

I also updated the test, which had the errant(?) structure.

XML generated on a site using the old DOMDocument generator:
<?xml version="1.0" encoding="UTF-8"?>
<!--generator='jetpack-14.6-a.9'-->
<!--Jetpack_Sitemap_Buffer_Video-->
<?xml-stylesheet type="text/xsl" href="//example.com/?jetpack-sitemap=video-sitemap.xsl"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com/?attachment_id=69</loc>
    <lastmod>2025-05-01T15:41:27Z</lastmod>
    <video:video>
      <video:title>some_video</video:title>
      <video:thumbnail_loc>https://s0.wp.com/i/blank.jpg</video:thumbnail_loc>
      <video:description></video:description>
      <video:content_loc>https://example.com/wp-content/uploads/2025/05/example_video.mp4</video:content_loc>
    </video:video>
  </url>
</urlset>
XML generated on a site using the new XMLWriter generator (prior to this PR):
<?xml version="1.0" encoding="UTF-8"?>
<!--generator='jetpack-14.6-a.9'-->
<!--Jetpack_Sitemap_Buffer_Video_XMLWriter-->
<?xml-stylesheet type="text/xsl" href="//example.com/?jetpack-sitemap=video-sitemap.xsl"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
 <url>
  <loc>https://example.com/?attachment_id=69</loc>
  <lastmod>2025-05-01T15:41:27Z</lastmod>
  <video:video>Array</video:video>
 </url>
</urlset>
Array we pass to `Jetpack_Sitemap_Buffer_Video_XMLWriter::append_item()`:
array (
  'url' => 
  array (
    'loc' => 'https://example.com/?attachment_id=69',
    'lastmod' => '2025-05-01T15:41:27Z',
    'video:video' => 
    array (
      'video:title' => 'some_video',
      'video:thumbnail_loc' => 'https://s0.wp.com/i/blank.jpg',
      'video:description' => '',
      'video:content_loc' => 'https://example.com/wp-content/uploads/2025/05/example_video.mp4',
    ),
  ),
)
Array structure the code was expecting:
array (
  'url' => 
  array (
    'loc' => 'https://example.com/?attachment_id=69',
    'lastmod' => '2025-05-01T15:41:27Z',
    'videos' => 
    array (
      array(
        'title' => 'some_video',
        'thumbnail_loc' => 'https://s0.wp.com/i/blank.jpg',
        'description' => '',
        'content_loc' => 'https://example.com/wp-content/uploads/2025/05/example_video.mp4',
      ),
      // ... potentially more items here?
    ),
  ),
)

Other information:

  • Have you written new tests for your changes, if applicable?
  • Have you checked the E2E test CI results, and verified that your changes do not break them?
  • Have you tested your changes on WordPress.com, if applicable (if so, you'll see a generated comment below with a script to run)?

Jetpack product discussion

Does this pull request change what data or activity we track or use?

Testing instructions:

  1. Add a video to your site: /wp-admin/upload.php
  2. Ensure your site is public and allowed to be indexed by search engines: /wp-admin/options-reading.php
  3. Ensure Sitemaps are enabled: /wp-admin/admin.php?page=jetpack#/settings?term=sitemaps
  4. Generate a new sitemap with `wp jetpack sitemap rebuild --purge` or by running the `jp_sitemap_cron_hook` cron.
  5. Go to the sitemap and verify the Video URL and other fields are filled out: /video-sitemap-1.xml

@tbradsha added the [Type] Bug, [Status] Needs Review, and Bug labels on May 1, 2025
@tbradsha requested review from kraftbj and a team on May 1, 2025, 18:09
@tbradsha self-assigned this on May 1, 2025

github-actions bot commented May 1, 2025

Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.

  • To test on WoA, go to the Plugins menu on a WoA dev site. Click on the "Upload" button and follow the upgrade flow to be able to upload, install, and activate the Jetpack Beta plugin. Once the plugin is active, go to Jetpack > Jetpack Beta, select your plugin (Jetpack), and enable the fix/jetpack/sitemap_notices branch.
  • To test on Simple, run the following command on your sandbox:
bin/jetpack-downloader test jetpack fix/jetpack/sitemap_notices

Interested in more tips and information?

  • In your local development environment, use the jetpack rsync command to sync your changes to a WoA dev blog.
  • Read more about our development workflow here: PCYsg-eg0-p2
  • Figure out when your changes will be shipped to customers here: PCYsg-eg5-p2

@github-actions bot added the [Feature] Sitemaps and [Plugin] Jetpack labels on May 1, 2025

github-actions bot commented May 1, 2025

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Review, ...).
  • ✅ Add a "[Type]" label (Bug, Enhancement, Janitorial, Task).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that it applies to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!


Jetpack plugin:

No scheduled milestone found for this plugin.

If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack.


jp-launch-control bot commented May 1, 2025

Code Coverage Summary

Coverage changed in 1 file.

| File | Coverage | Δ% | Δ Uncovered |
| --- | --- | --- | --- |
| projects/plugins/jetpack/modules/sitemaps/sitemap-buffer-video-xmlwriter.php | 28/29 (96.55%) | -3.45% | 1 ❤️‍🩹 |



@anomiex anomiex left a comment


Seems reasonable. Haven't tested though.

Looks like there's a test that's passing the "videos" rather than "video:video" that needs to be updated though. 😀

if ( ! empty( $array['url'] ) ) {
$this->writer->startElement( 'url' );
// Return early if missing fundamental data.
if ( empty( $array['url']['video:video'] ) ) {

I note this is a slight behavior change, in that if there isn't any video:video data then it'll skip the <url> entirely now instead of outputting it like the DOMDocument generator does.

But possibly that behavior of the DOMDocument generator is incorrect. 🤷

Reply from the PR author (@tbradsha):

Yeah, I consciously made this change; if there's no video data, then it doesn't make sense to map to a page with the missing data. I could even see search engines penalizing this if they wanted.
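
To make the behavior difference concrete, here is a small sketch contrasting the two options. `sketch_render_url()` and its `$skip_without_video` flag are hypothetical and exist only for this comparison; `true` mirrors the early return added in this PR, while `false` mirrors the behavior described above for the DOMDocument generator.

```php
<?php
/**
 * Hypothetical helper contrasting the two behaviors discussed in this thread.
 * Not part of Jetpack.
 */
function sketch_render_url( array $url, $skip_without_video ) {
    $has_video = ! empty( $url['video:video'] );

    // New behavior: no video data means no <url> entry at all.
    if ( $skip_without_video && ! $has_video ) {
        return '';
    }

    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->startElement( 'url' );
    $writer->writeElement( 'loc', $url['loc'] );

    if ( $has_video ) {
        $writer->startElement( 'video:video' );
        foreach ( $url['video:video'] as $tag => $value ) {
            $writer->writeElement( $tag, (string) $value );
        }
        $writer->endElement(); // Closes video:video.
    }

    $writer->endElement(); // Closes url.
    return $writer->outputMemory();
}

// An item with a location but no video data.
$no_video = array( 'loc' => 'https://example.com/?attachment_id=69' );

$kept    = sketch_render_url( $no_video, false ); // A bare <url> with only <loc>.
$skipped = sketch_render_url( $no_video, true );  // An empty string.
```

The skipped variant keeps video sitemaps free of entries that carry no video metadata, which is the intent stated in the reply above.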


tbradsha commented May 1, 2025

> Looks like there's a test that's passing the "videos" rather than "video:video" that needs to be updated though.

@kraftbj I've updated the test to match this, but I'm curious where you got the `videos` structure in the original test you added. I'm worried I'm missing some weird scenario.


kraftbj commented May 5, 2025

@tbradsha I really struggled with the video sitemap before, so I think this was a bug that I didn't understand. Thanks for fixing it.

@kraftbj added the [Status] Ready to Merge label and removed the [Status] Needs Review label on May 5, 2025
@tbradsha merged commit de20644 into trunk on May 5, 2025
71 of 72 checks passed
@tbradsha deleted the fix/jetpack/sitemap_notices branch on May 5, 2025, 15:55
@github-actions bot removed the [Status] Ready to Merge label on May 5, 2025
simison pushed a commit that referenced this pull request May 6, 2025
* Properly handle item structure

* Rework some logic and return early

* Add changelog

* Fix test
Labels: Bug, [Feature] Sitemaps, [Plugin] Jetpack, [Tests] Includes Tests, [Type] Bug