Commit 22eff2d

Process imported school URLs
The GIAS CSV contains some school website addresses that don't have a protocol, so they are not valid URLs according to validate_url. To work around this, we prepend `https://`, which assumes that the school websites have HTTPS set up. This isn't a given, but it feels like the better default, since HTTPS is strongly pushed by browser vendors. The URL for Legh Vale Primary School is also plainly malformed and starts with `http:www`. It's the only instance of this, but we have to work around it.
1 parent 4acd614 commit 22eff2d
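
To illustrate why the values described in the commit message fail validation, here is a minimal sketch that uses Ruby's standard `URI` library as a stand-in. It is not how validate_url is implemented; the helper name `looks_like_web_url?` and the `www.example.sch.uk` address are hypothetical and chosen only for illustration.

```ruby
require "uri"

# Hypothetical helper (not part of the commit): true when the string parses
# as an http(s) URI that also has a host, using only the standard library.
def looks_like_web_url?(value)
  uri = URI.parse(value)
  %w[http https].include?(uri.scheme) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

looks_like_web_url?("www.example.sch.uk")                 # no scheme at all
looks_like_web_url?("http:www.leghvale.st-helens.sch.uk") # scheme, but no "//" so no host
looks_like_web_url?("https://www.example.sch.uk")         # scheme and host present
```

Only the last form carries both a scheme and a host, which is why the first needs a protocol prepended and the second needs its `http:www` prefix repaired.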

1 file changed (+11, -1)

lib/tasks/schools.rake

Lines changed: 11 additions & 1 deletion
@@ -79,6 +79,16 @@ namespace :schools do
       )
       # rubocop:enable Rails/SaveBang
 
+      # Some URLs from the GIAS CSV are missing the protocol.
+      process_url = ->(url) do
+        return nil if url.blank?
+        url.start_with?("http://", "https://") ? url : "https://#{url}"
+
+        # Legh Vale school has a URL of http:www.leghvale.st-helens.sch.uk
+        # which is not a valid URL.
+        url.gsub!("http:www", "http://www")
+      end
+
       CSV.parse(
         csv_content,
         headers: true,
@@ -96,7 +106,7 @@ namespace :schools do
           town: row["Town"],
           county: row["County (name)"],
           postcode: row["Postcode"],
-          url: row["SchoolWebsite"]
+          url: process_url.call(row["SchoolWebsite"].presence)
         )
 
       if locations.size >= batch_size
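
One thing worth checking in the `process_url` lambda as it appears in the diff above: the result of the `start_with?` ternary is not assigned to anything, and `String#gsub!` returns `nil` when it makes no substitution. If the diff is complete as shown, the lambda would return `nil` for every URL except the Legh Vale one. The sketch below is one possible correction, not the author's code. It uses plain Ruby (`nil?`, `strip`, `empty?`) instead of ActiveSupport's `blank?` so that it runs outside Rails, and `www.example.sch.uk` is a made-up address.

```ruby
# Hypothetical rewrite of the process_url lambda from the diff.
# Inside the Rails rake task, `url.blank?` could replace the nil/empty check.
process_url = lambda do |url|
  next nil if url.nil? || url.strip.empty?

  # Repair the one known malformed value first (Legh Vale: "http:www...").
  # The non-bang `sub` always returns a string, even when nothing matches.
  fixed = url.strip.sub(/\Ahttp:www/, "http://www")

  # Prepend https:// only when the value has no protocol yet.
  fixed.start_with?("http://", "https://") ? fixed : "https://#{fixed}"
end

process_url.call("www.example.sch.uk")
process_url.call("http:www.leghvale.st-helens.sch.uk")
process_url.call("")
```

The order matters here: repairing `http:www` before the protocol check keeps the Legh Vale value from being prefixed with `https://` on top of its broken `http:` scheme.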

0 commit comments
