Change Log

[25.6.1] - 2025-12-08

bump django to 5.2.8
fix: cap length for elasticsearch document id

[25.6.0] - 2025-10-30

bump dependencies
- celery to 5.5.3
- kombu to 5.5.4
improve error handling in celery task-result backend
use logging config in celery worker
improve code docs (README.md et al.)
add cardsearch feeds (rss and atom)
- /trove/index-card-search/rss.xml
- /trove/index-card-search/atom.xml
fix: render >1 result in streamed index-value-search (csv, tsv, json)
when browsing trove api in browser, wrap non-browser-friendly mediatypes in html (unless withFileName, which requests download)
better trove.render test coverage
code cleanliness
- de-collide "simple" names
  - SimpleRendering => EntireRendering
  - SimpleTrovesearchRenderer => TrovesearchCardOnlyRenderer
- consolidate more shared logic into trove.util
- more accurate type annotations

[25.5.0] - 2025-07-15

use python 3.13
use poetry to manage dependencies
upgrade various dependencies
start using mypy for type-checking (loosely)

[25.4.0] - 2025-06-24

delete RawDatum model
- trove.digestive_tract.extract now must succeed before /trove/ingest responds
rename IndexcardRdf (and kids) to ResourceDescription
move most django models to their own files
stop storing CeleryTaskResults forever
- new environment variables: CELERY_RESULT_EXPIRES, FAILED_CELERY_RESULT_EXPIRES
fix: /api/v2/ error generating rss/atom feed links
fix: pagination at /api/v2/sourceconfigs
fix: correct osfmap IRIs (dcat:accessURL, osf:verifiedLink)

[25.3.3] - 2025-06-17

smaller osfmap_json derived representation (thx bodintsov)
prepare for next release dropping RawDatum model/table:
- mirror expiration_date database column from RawDatum to IndexcardRdf
- add management command migrate_rawdatum_expiration to copy old values

[25.3.2] - 2025-05-21

fix: avoid sharev2_elastic queue backups by acking more correctly

[25.3.1] - 2025-05-20

fix: more consistent suggestedFilterOperator values in json api
configurable rabbitmq connection heartbeat timeout via RABBITMQ_HEARTBEAT_TIMEOUT env var

[25.3.0] - 2025-05-07

remove search-text parsing from base trovesearch params (syntax may now vary by index strategy)
add search-text syntax to trovesearch_denorm index strategy (using elasticsearch simple_query_string)
add osf:verifiedLinks entry to osfmap thesaurus
remove trove_indexcard_flats index strategy, a cautionary tale of elasticsearch nested (which is already cautioned against by its own docs, yes)
add SimpleChainMap util, alternative to collections.ChainMap that doesn't do updates (uses more permissive Mapping type over MutableMapping)
add BasicTroveParams (shared params for all trove endpoints)
add/use base trove.views:
- BaseTroveView: parses BaseTroveParams, renders rdf data (response content) accordingly
- StaticTroveView: responds with same static rdf data every time
- GatheredTroveView: gathers rdf data via given primitive_metadata.gather.GatheringOrganizer
fix /trove/browse?iri=... and /trove/index-card/...
updo html rendering of /trove/ responses
add landing page of static data (links to docs, etc) rendered same way as /trove/ responses
easier editing feature flags via /admin/ (list-view checkboxes)
remove no-longer-used feature flag TROVESEARCH_DENORMILY
add better "end to end" search-api tests
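
The SimpleChainMap entry in this release describes a read-only alternative to collections.ChainMap that accepts any Mapping. The actual implementation is not shown in this change log; the following is only a minimal sketch of that idea, using a hypothetical class name (`ReadOnlyChainMap`) and assuming that earlier maps take precedence, as they do in collections.ChainMap.

```python
from collections.abc import Iterator, Mapping
from typing import Any


class ReadOnlyChainMap(Mapping):
    """Hypothetical sketch: look keys up across several mappings, never write."""

    def __init__(self, *maps: Mapping) -> None:
        # any Mapping is accepted; nothing here requires MutableMapping
        self._maps = maps

    def __getitem__(self, key: Any) -> Any:
        # the first mapping that has the key wins
        for each_map in self._maps:
            try:
                return each_map[key]
            except KeyError:
                continue
        raise KeyError(key)

    def __iter__(self) -> Iterator:
        # yield each distinct key once, even if several maps contain it
        seen = set()
        for each_map in self._maps:
            for key in each_map:
                if key not in seen:
                    seen.add(key)
                    yield key

    def __len__(self) -> int:
        return sum(1 for _ in self)
```

Because the class defines no `__setitem__`, an assignment such as `chain["a"] = 1` raises TypeError, which is the "doesn't do updates" property in this sketch.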

[25.2.0] - 2025-05-05

further move on from SHAREv2...
delete sharev2 ingestion pipeline
- share/harvest/*
- share/harvesters/*
- share/metadata_formats/*
- share/regulate/*
- share/schema/*
- share/sources/*
- share/tasks/*
  - except schedule_index_backfill -- moved to share.models.index_backfill for now
- share/transform/*
- share/transformers/*
- anything deactivated by the ignore_sharev2_ingest feature flag
delete (some) sharev2 db models/tables from share/models/...
- core.py: NormalizedData, FormattedMetadataRecord
- jobs.py: HarvestJob
- registration.py: ProviderRegistration
- sources.py: SourceStat
delete (some) sharev2 api
- /api/v2/formattedmetadatarecords/...
- /api/v2/normalizeddata/...
  - note: sharev2 “push” is a POST to this endpoint -- replaced by /trove/ingest
- /api/v2/sourceregistrations/...
- /api/v2/schemas/...
- /api/v1/share/data
delete sharectl (share/bin/*)
- prefer django management commands, for now
add management commands
- shtrove_indexer_run (replaces sharectl search daemon)
- shtrove_search_setup (replaces sharectl search setup)
- shtrove_search_teardown (replaces sharectl search purge)
- delete_pretrove_data (for letting go of some past)
remove special ember-share handling (for local dev)
remove all dead code and requirements easily removed
update github actions flow (with more accurate code coverage)

[25.1.1] - 2025-03-04

reduce wasteful text-field indexing (better this time)
on the share-admin search-indexes page:
- require typed confirmation when deleting indexes
- allow deleting way more indexes
(for osf search ui) update link description in osfmap

[25.1.0] - 2025-02-24

update IndexStrategy to allow multiple indexes within a strategy
trovesearch_denorm index strategy updates:
- multiple indexes: one for card-search, one for value-search on iri values
- skip indexing some text fields (e.g. *.identifier, glob-paths of depth > 1)
reduce wasteful computing (fewer queries, less hashing)
add to metadata for osfmap:affiliation
improve local setup, perhaps

[25.0.0] - 2025-01-06

update calendar version to 25, reset semantic versions to 0
trove-search api:
- support jsonapi fields[TYPE] query params; see https://jsonapi.org/format/#fetching-sparse-fieldsets
- when TYPE in fields[TYPE] matches the value of a cardSearchFilter[resourceType] query param, interpret the given fields as shorthand property-paths and use for custom csv/tsv columns
- streaming "simple json" rendering (acceptMediatype=application/json)
- when sorting by integer values, treat missing values as zero (tho there may be future times this is wrong...)
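
To illustrate how the fields[TYPE] shorthand lines up with cardSearchFilter[resourceType], here is a small sketch that builds such a query string with Python's standard library. The host name, the `Preprint` resource type, the chosen property-paths, and the `text/csv` mediatype value are all assumptions made for the example, as is the use of the /trove/index-card-search path for this release.

```python
from urllib.parse import urlencode


def build_cardsearch_url(base_url, resource_type, fields, mediatype):
    """Build a card-search URL that requests custom columns (illustrative only)."""
    params = [
        # filter results to one resource type...
        ("cardSearchFilter[resourceType]", resource_type),
        # ...and reuse that same type name as TYPE in fields[TYPE];
        # the value is a comma-separated list of shorthand property-paths
        (f"fields[{resource_type}]", ",".join(fields)),
        ("acceptMediatype", mediatype),
    ]
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_cardsearch_url(
        "https://share.example/trove/index-card-search",  # placeholder host
        "Preprint",
        ["title", "creator.name"],
        "text/csv",
    ))
```

Note that `urlencode` percent-encodes the square brackets and commas; servers decode them back before parsing.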

[24.7.0] - 2024-12-27

allow rendering search responses as downloadable CSVs/TSVs
add, reshape renderer output types

[24.6.2] - 2024-12-05

more stable indexer daemon
trovesearch_denorm indexing tweaks:
- move iri-value delete_by_query into followup task
suggest affiliation instead of creator.affiliation for osf:Preprint searches
local docker-compose: give worker access to elasticsearch

[24.6.1] - 2024-11-27

fix(trovesearch_denorm): keep iris whole in path-based fieldnames

[24.6.0] - 2024-11-27

ignore trailing slashes on iri values
have more than one shard and replicas

[24.5.0] - 2024-11-14

fix for M chips with docker
add "subject" related property for cardSearchFilter[resourceType]=Project
allow "supplementary" metadata records
allow expiration date on metadata records
osfmap: add properties with shorthands
add trovesearch_denorm index strategy (more denormalized for better scaling)
dependency updates
many tests

[24.4.0] - 2024-05-31

specific exception classes within trove
better search api error responses
better search-api html experience
more static vocabs
fix various errors

[24.3.1] - 2024-05-23

fix: jsonapi renderer now chooses type consistently

[24.3.0] - 2024-05-22

speed up oai-pmh queries
improve trove simple-json and html experience

[24.2.0] - 2024-05-21

add "simple json" renderer for search api responses

[24.1.0] - 2024-05-20

update django to 3.2.25
fix oai-pmh feed

[24.0.0] - 2024-02-13

add osfmap:hasCedarTemplate to trove.vocab

[23.4.1] - 2023-11-10

fix: allow date literals for legacy sharev2_elastic deriver

[23.4.0] - 2023-11-09

add docs:
- /trove/docs/openapi.json
- /trove/docs/openapi.html
- /vocab/2023/trove/...

[23.3.0] - 2023-10-24

allow adding propertypaths to cardSearchText and valueSearchText
- e.g. cardSearchText[creator.name]=...
anywhere a set of propertypaths is encoded in query params, allow simple glob-paths ("*", "*.*", "*.*.*") that match any propertypath of the given length
- note: partial globs (e.g. "*.name" or "publisher.*") are not supported (...yet?)
when an iri value returned by an index-value-search has a full index-card, include that index-card instead of the stub built from indexed values
friendlier FeatureFlag admin list
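
The glob-path rule in this release (a glob-path matches any propertypath of the same length, and partial globs are rejected) can be sketched as below. This is not the project's code: the function names are hypothetical, and it assumes each glob step is written as `*` and that paths are already split into tuples of steps.

```python
GLOB_STEP = "*"  # assumed spelling of a wildcard step


def is_simple_globpath(path: tuple[str, ...]) -> bool:
    """True only when every step of a non-empty path is a wildcard."""
    return bool(path) and all(step == GLOB_STEP for step in path)


def globpath_matches(globpath: tuple[str, ...], propertypath: tuple[str, ...]) -> bool:
    """A simple glob-path matches any propertypath with the same number of steps."""
    if not is_simple_globpath(globpath):
        # e.g. ("*", "name") mixes wildcard and named steps
        raise ValueError(f"partial globs are not supported: {globpath!r}")
    return len(globpath) == len(propertypath)
```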

[23.2.0] - 2023-10-09

BREAKING: allow multiple propertypaths in query params
- use . to delimit steps in a path; e.g. creator.affiliation is a path of two steps (previously would be creator,affiliation)
- use , to delimit multiple paths; e.g. creator.name,contributor.name would be two paths (previously impossible)
- hidden behind feature flag: periodic_propertypaths
add missing OSFMAP shorthands
fix: in index-card-search, do not show "next" link when no results
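
The delimiter change in this release can be shown with a short parsing sketch: `,` separates paths and `.` separates the steps within a path. The function name and the choice to reject empty steps are assumptions for illustration, not the project's actual parser.

```python
def parse_propertypath_set(param_value: str) -> frozenset[tuple[str, ...]]:
    """Split a query-param value into a set of propertypaths (tuples of steps)."""
    paths = set()
    for raw_path in param_value.split(","):  # "," delimits multiple paths
        steps = tuple(step.strip() for step in raw_path.split("."))  # "." delimits steps
        if not all(steps):
            # assumed behavior: an empty step (e.g. "creator..name") is an error
            raise ValueError(f"empty step in propertypath: {raw_path!r}")
        paths.add(steps)
    return frozenset(paths)
```

Under this sketch, `creator.affiliation` parses to one path of two steps, and `creator.name,contributor.name` parses to two paths.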

[23.1.9] - 2023-08-29

more consistent pagination over randomly ordered results
correct test setup for trove_indexcard_flats

[23.1.8] - 2023-08-28

skip "first" link from first page

[23.1.7] - 2023-08-28

disable pagination on large, randomly-sorted result sets

[23.1.6] - 2023-08-28

more efficient random sort (for sorting by relevance to nothingness)

[23.1.5] - 2023-08-28

remove trove_indexcard (fully replaced by trove_indexcard_flats)
trove_indexcard_flats updates:
- log search queries when in DEBUG mode
- disable "unnamed filter values" aggregations (expensive and yet unused)

[23.1.4] - 2023-08-24

fix: trove_indexcard_flats would clobber some iri values while flattening
skip indexing cards that don't have osfmap_json
more gracefully handle erroneously circular skos:Concept hierarchies

[23.1.3] - 2023-08-23

lil optimization to skip unhelpful aggregations
disable tests using elasticsearch5 on github actions
- (will soon reenable or remove elastic5 altogether)

[23.1.2] - 2023-08-23

add trove_indexcard_flats index strategy
- copy of trove_indexcard with flatter queries (and more info on the root doc)
fix: allow more than 11 related properties on an index-card-search to have non-zero count

[23.1.1] - 2023-08-23

small improvements to trove_indexcard index strategy
- skip indexing metadata with osfmap:contains in the path (don't index file metadata with its container)
- better consolidate nested_iri to reduce number of nested docs

[23.1.0] - 2023-08-22

introducing "trove"
- store metadata records as small rdf documents called "index cards"
- ingest rdf
- add iri-centric search
"shtrove": working to preserve back-compat (because trove may be trouble)

[23.0.6] - 2023-05-11

make SourceConfig.disabled prevent harvest tasks running

[23.0.5] - 2023-05-11

downgrade to python 3.10 (for now)
improve logging
- replace raven (deprecated) with sentry-sdk
- add logging formatter for json with severity (for logging in deployments)
remove squashed migrations, dead code

[23.0.4] - 2023-05-05

fix a typo

[23.0.3] - 2023-05-05

admin interface: allow re-ingesting all data for a source config (see "ingest" buttons at /admin/share/sourceconfig/)
address possible cause of some backfill gaps
fix logging errors

[23.0.0] - 2023-05-03

upgrade to python 3.11
upgrade to elasticsearch 8
add share.search.index_strategy to act as a slippery abstraction layer between search-engine backend and planned friendly search api
- configure two index strategies (and make it easy to add more in the future):
  - sharev2_elastic5: the existing/legacy SHAREv2 search index as exists on elasticsearch5 and exposed via /api/v2/search/creativeworks/_search
  - sharev2_elastic8: a mirror/replacement for sharev2_elastic5 with all the same _source docs (but possible incompatibilities for the existing pass-thru api)
add a happy-path index-backfill workflow to the admin interface at /admin/search-indexes
- when changing index-strategy settings/mappings/whatever, the "happy path" is to create, backfill, verify a new copy of the index; then switch which is used for searching, verify again, and finally delete the old index.
- not intended to have the power of a full elasticsearch management interface -- just enough visibility to see whether things are going ok and where to start looking if something goes wrong
for testing, support indexStrategy query param to /api/v2/search/creativeworks/_search, /api/feeds/rss, /api/feeds/atom
- may request a configured strategy (e.g. indexStrategy=sharev2_elastic8) or a specific version of an index within a strategy (e.g. indexStrategy=sharev2_elastic8__bcaa90e8fa8a772580040a8edbedb5f727202d1fca20866948bc0eb0e935e51f)
add FeatureFlag model, use it to switch default search strategy (name="elastic_eight_default")

[22.0.1] - 2022-08-29

add suid value to sharev2_elastic index

[22.0.0] - 2022-08-29

easy additive elastic mapping changes
add osf_related_resource_types field
dockerfile updates

[21.3.1] - 2021-07-28

update raven

[21.3.0] - 2021-07-28

update and consolidate docs
audit and upgrade all dependencies
switch to github actions for tests/ci

[21.2.2] - 2021-05-25

fix: feeds should not break on null date_published

[21.2.1] - 2021-05-25

fix: oai_dc formatter breaks on deletions

[21.2.0] - 2021-05-25

big rend! remove many things:
- concepts:
  - merging data from multiple sources together (aiming instead for a simple, robust repository of metadata records -- let's talk later/soon about how we might do merging well)
- models:
  - ShareObject and all its descendants
  - ShareObjectVersion and all its descendants
  - Change
  - ChangeSet
  - SubjectTaxonomy
  - UnusedCeleryProviderTask
  - UnusedCeleryTask
- api routes:
  - all auto-generated ShareObject routes (e.g. /api/v2/creativeworks/)
  - all schema routes (except the root /api/v2/schema/)
    - auto-generated schema routes (e.g. /api/v2/schema/disputes/)
    - work type hierarchy (/api/v2/schema/creativeworks/hierarchy/)
  - /api/v2/graph/
admin features/improvements
- add FormattedMetadataRecord admin
- when investigating a problem, start by finding the suid and navigate relationships from there
- add action to delete all FormattedMetadataRecords for some chosen suid(s) (good for spam control)

[21.1.4] - 2021-05-17

fix a 500 error at /api/v2/
fix sending useful debugging info to sentry

[21.1.3] - 2021-05-05

make the oai-pmh feed respect switch-flipping

[21.1.2] - 2021-05-05

give an accurate date_created in sharev2_elastic formatter
fix admin bug -- don't hide the search box
add django-debug-toolbar to dev dependencies

[21.1.1] - 2021-05-04

tidy up some admin inefficiencies

[21.1.0] - 2021-04-21

expose a few models in read-only json:api, so the frontend can be useful given a suid
- /api/v2/formattedmetadatarecords/
- /api/v2/sourceconfigs/
- /api/v2/suids/
add new atom/rss feeds that get results from the new backcompat index
- /api/v2/feeds/atom/
- /api/v2/feeds/rss/
- (old feeds now deprecated, will be gone with ShareObject)

[21.0.8] - 2021-04-01

add --pls-reingest arg to format_metadata_records command

[21.0.7] - 2021-04-01

fix: facility != funder (in gov.clinicaltrials transformer)

[21.0.6] - 2021-04-01

remove feature: oai_dc formatter no longer puts first author last
add utility: share.util.names.get_related_agent_name for consistently getting an agent name from an "agent-work relation" node
- if missing both cited_as and name (true of some old, unregulated production data), reluctantly apply some cultural assumptions and build a name from parts (given_name, additional_name, family_name, suffix)

[21.0.5] - 2021-03-12

bugfix: in share.util.graph, handle merging nodes with dictionary values
bugfix: when formatting oai_dc, strip characters illegal in XML
when regulating, discard gravatars as agent identifiers

[21.0.4] - 2021-03-11

bugfix: deduping subjects in custom taxonomies

[21.0.3] - 2021-03-10

fix up populate_osf_suids with more useful messaging
improve "central node" guessing to handle old osf data on prod

[21.0.2] - 2021-03-09

speed up populate_osf_suids -- exclude NormalizedData with null raw, since they'll be ignored anyway

[21.0.1] - 2021-03-09

fix populate_osf_suids script to handle fun situations

[21.0.0] - 2021-03-09

new model: FormattedMetadataRecord
new sharectl commands:
- sharectl search purge
- sharectl search setup <index_name>
- sharectl search setup --initial
- sharectl search set_primary <index_name>
- sharectl search reindex_all_suids <index_name>
new management commands:
- format_metadata_records
- populate_osf_suids
new doc: README-docker-quickstart.md -- the easy way to get started
define the "share schema" statically (in share.schema)
- stop inferring everything from the ShareObject models
add a parallel ingestion path, preparing for a future without ShareObject
- use only the most recent NormalizedData for each suid (no merging)
- allow explicitly stating the suid when pushing a NormalizedData
  - if not specified, try looking for an OSF guid
- build a FormattedMetadataRecord for each metadata format
- currently two metadata formatters (and room for more):
  - sharev2_elastic: for a back-compatible elasticsearch index -- builds a document just like share.search.fetchers.CreativeWorkFetcher, but from a NormalizedData instead of all the ShareObject tables
  - oai_dc: dublin core XML, for the OAI-PMH feed
indexer daemon overhaul
- assorted cleanup; dead/useless code removal
- add ElasticManager to encapsulate all requests sent to elasticsearch
- add IndexSetup concept to describe how to get/build documents for an index and what messages to send to that index's daemon
- currently two index setups:
  - share_classic: index by AbstractCreativeWork id, using existing share.search.fetchers logic
  - postrend_backcompat: index by SourceUniqueIdentifier id, using the sharev2_elastic FormattedMetadataRecords
add a parallel OAI-PMH that uses FormattedMetadataRecord with oai_dc
- remains dormant for the moment -- enable with pls_trove query param
- NOTE: when we switch over, OAI-PMH datestamps will all be new and recent
admin updates:
- search IngestJob by suid value

[20.2.0] - 2020-09-03

Add a decorator for marking views deprecated
Mark some views deprecated
Sources added via API default to canonical

[20.1.0] - 2020-06-16

Automatically schedule ingest tasks after harvesting
Schedule ingest tasks in admin reenqueue action
Pin faker to 4.0.3
Update .travis.yml
Fix bug in io.osf.registrations transformer

[20.0.4] - 2020-01-13

Ensure order in oai-pmh

[20.0.3] - 2020-01-09

Exclude frankenworks from oai-pmh

[20.0.2] - 2020-01-06

Reduce oai-pmh page size

[20.0.1] - 2020-01-03

Pin graphql-relay to a compatible version

[20.0.0] - 2020-01-03

Dockerfile fixes & improvements
Optimize oai-pmh endpoint to avoid timeouts
Add reindex_works shell util

[19.0.6] - 2019-12-06

Pin python-dateutil to a version that doesn't break tests (2.8.0)
Temporarily (i hope) skip tests broken by 19.0.5

[19.0.5] - 2019-12-06

Temporary fix to avoid slow IngestJob queries

[19.0.4] - 2019-02-25

Possibly fix a rare forceingest error

[19.0.3] - 2019-01-04

Skip indexing works with too many agent relations

[19.0.2] - 2019-01-03

Make the indexer more configurable by environment variables

[19.0.1] - 2019-01-02

Fix indexer deadlock

[19.0.0] - 2019-01-02

Allow turning off ingestion (but not harvest) for non-canonical sources
Ingestion perf improvements (faster attr access in MutableGraph)
Handle indexer errors better

[18.0.6] - 2018-12-13

Ingestion perf improvements

[18.0.5] - 2018-10-30

Update requests dependency

[18.0.4] - 2018-10-25

Make it easier to reingest all OSF data

[18.0.3] - 2018-10-24

Fix worker out of memory errors

[18.0.2] - 2018-10-23

Update nameparser dependency

[18.0.1] - 2018-10-23

Add datacite oai-1.1 schema namespace
Fix common datacite transform errors

[18.0.0] - 2018-10-23

Update django to 1.11.16
Clean up disambiguation logic to make extending it less painful
Extend disambiguation to match contributors with different name formats
Rename fixpreprintdisambiguations command to forceingest
- Handle more complex merges

[2.16.11] - 2018-08-16

Improve error message for transformer errors
Fix OSF registration transformer

[2.16.10] - 2018-07-30

Update NSF harvester to look farther into the past
Fix a bug in the OSF project harvester
Fix --osf-only flag in fix_datacite command

[2.16.9] - 2018-06-21

When a job is marked "skipped", not even superfluous will re-run it

[2.16.8] - 2018-06-14

All retried jobs should be marked "rescheduled"

[2.16.7] - 2018-06-14

Harvest jobs that are retried when the same source is already being harvested should be marked "rescheduled" rather than "failed"

[2.16.6] - 2018-06-14

Handle OSF harvest errors gracefully

[2.16.5] - 2018-06-04

Pin kombu to 4.1.0

[2.16.4] - 2018-06-04

Harvest all set specs from CSIC
Allow sorting Atom feed by date_created and date_published
Don't create unnecessary source configs for each new source
Update pytest-django dependency to avoid version conflict

[2.16.3] - 2018-06-04

Fix bug in indexer daemon, stop all threads when one dies

[2.16.2] - 2018-04-30

Fix typo in sharectl ingest that prevented bulk reingestion

[2.16.1] - 2018-04-30

Fix date range filtering in com.figshare.v2 harvester

[2.16.0] - 2018-04-26

Bulk reingestion with IngestScheduler.bulk_reingest() and sharectl ingest
Admin interface updates
More stable and reliable indexer daemon
"Urgent" queues for ingestion and indexing, allowing pushed data to jump ahead of harvested data
Various source config updates

[2.15.6] - 2018-04-04

Fix PeerJ transformer error

[2.15.5] - 2018-03-15

Prevent infinite task loop for certain types of errors

[2.15.4] - 2018-03-15

Update raw data janitor to skip over datums from disabled/deleted sources

[2.15.3] - 2018-03-15

Fix bug in fixpreprintdisambiguations command

[2.15.2] - 2018-03-12

Fix a broken test

[2.15.1] - 2018-03-12

Fix some time-sensitive tests

[2.15.0] - 2018-03-05

Ingest architecture

Add IngestJob, used to keep track of a RawDatum's ingestion status
- Exposed in API at /api/v2/ingestjobs/
- In the response to pushed data, include a link to the IngestJob
Rename HarvestLog to HarvestJob
Combine transform and disambiguate tasks into ingest task
Catch all errors caused by bad input data, store them on the IngestJob
Add Regulator, a place to put logic/transforms/validation that should run on all data, regardless of source
Fix: Prevent indexer daemon threads from exiting when elasticsearch times out

Existing sources

Map work relation types in MODS transformer
Update edu.utah source config to include more approved sets
Update edu.umassmed source config to use HTTPS

[2.14.11] - 2018-02-26

Update pendulum dependency to avoid infinite janitor loop

[2.14.10] - 2018-02-26

Fix elasticsearch_janitor task
- Expect (and give) str arguments, avoiding error
- Use the indexer daemon by default

[2.14.9] - 2018-02-22

Speed up update_elasticsearch task:
- Don't count the works just for a log message
- Use the indexer daemon by default, instead of index_model tasks
Only run one update_elasticsearch task at a time

[2.14.8] - 2018-02-22

Add --delete-related and --superfluous flags to enforce_set_lists
Improve script output by including ids in ShareObject.repr

[2.14.7] - 2018-02-18

Devops updates for new environment

[2.14.6] - 2018-02-12

Actually speed up OAI feed

[2.14.5] - 2018-02-12

Speed up OAI feed when filtering by set
Delete merged works with no identifiers in fixpreprintdisambiguations

[2.14.4] - 2018-02-08

Allow omitting arXiv from fix_datacite script

[2.14.3] - 2018-02-05

Add parameters to fix_datacite script

[2.14.2] - 2018-02-01

Changed

Use normalized agent name in Atom feed, instead of cited_as
Update psycopg dependency

[2.14.1] - 2018-01-18

Added

Type map for Columbia Academic Commons (edu.columbia)
Type map for University of Cambridge (uk.cambridge)

[2.14.0] - 2018-01-10

Added

Allow reading/writing Source.canonical at /api/v2/sources/
Include <author> in atom feed at /api/v2/atom/
ScholarsArchive@OSU source config for their new API

Changed

Prevent OSF harvester from being throttled
Update NSFAwards harvester/transformer to include more fields

[2.13.1] - 2018-01-04

Fixed

Use request context to build URLs in the API instead of SHARE_API_URL setting
- Stop displaying localhost:8000 links

Added

Add --from parameter to fixpreprintdisambiguations management command

[2.13.0] - 2017-12-18

Added

Support for set blacklists for sources that follow OAI-PMH protocol
- enforce_set_lists command to enforce set blacklist and whitelist
Set whitelist for UA Campus Repository
Support for encrypted json field and start using it in SourceConfig model
Enable Coveralls
Include work lineage (based on IsPartOf relations) in the search index payload
Add self links to objects returned by the API

Changed

Collect metadata in MODS format from UA Campus Repository
Update columbia.edu harvester source config (disabled set to false)
Improve creating Sources at /api/v2/sources/
- Use POST to create, PATCH to update
- Respond with sensical status codes (409 on name conflict, etc.)

Fixed

Backfill CHANGELOG.md to include 2.10.0 and 2.11.0
Correctly encode &, <, > characters in the Atom feed
Avoid DB connection leak by disabling persistent connections

[2.12.0] - 2017-09-14

Added

editsubjects management command to modify share/subjects.yaml

Changed

Replace share/models/subjects.json with share/subjects.yaml
Update central subjects taxonomy to match Bepress' 2017-07 update

[2.11.0] - 2017-08-27

Added

Symbiota as a source
AEA as a source

Changed

Used django-include for a faster OAI-PMH endpoint
Updated regex for compatibility with Python 3.6

[2.10.0] - 2017-08-03

Added

University of Arizona as a source
NAU Open Knowledge as a source
Started collecting analytics on source APIs (response time, etc.)
Support for custom taxonomies

[2.9.0] - 2017-06-15

Added

sharectl command line tool
Profiling middleware for local development
Janitor tasks to find and process unprocessed data
Timestamp field to RawData
Mendeley Harvester!
Started to use deprecation warnings
Timeouts for harvests

Removed

The concept of "Bots"
A lot of dead code
A GPL-licensed library

Changed

Upgraded to Celery 4.0
Deleted works now return 403s from the API
Deleted works are now excluded from the API
Corrected the date fields used to audit the Elasticsearch index
Strongly defined the Harvester interface
Harvests are now scheduled in a more friendly manner
Updated the configurations for many OAI sources

Fixed

HarvestLogs no longer get stuck in progress
Text parsing transformer utilities
MODS transformer looks at the location field in addition to other fields for a work identifier

[2.8.0] - 2017-05-12

Added

Elasticsearch Janitor task to keep Postgres and ES in sync
Concurrently added indexes
Admin updates to allow quicker fixing of broken data
More test coverage

Removed

Elasticsearch's scroll API explicitly disabled

Changed

Upgraded to Django 1.11
Elasticsearch now pulls last_modified from itself rather than Postgres

Fixed

API pagination no longer times out on large collections
Timestamps are now included in the ATOM feed

[2.7.0] - 2017-05-04

Added

OAI endpoint
Sources
- OpenBU

Changed

Updated documentation

[2.6.0] - 2017-03-28

Added

Sources
- A table for managing SHARE data sources
- Replaces the apps in the providers folder
SourceConfigs
- A table for managing different methods of acquiring data from a given source
- Replaces nested apps/app labels
HarvestLogs
- First class support for managing harvesting/back harvesting
Source Unique Identifiers
- First class representation of what was RawData.provider_doc_id
The Django admin now supports starting harvesters over long periods of time
Support for the MODS OAI-PMH prefix

Removed

Provider Django applications have been removed
Source specific fields have been removed from ShareUser

Changed

Harvesters have been relocated to share/harvesters/
Various renaming/vocabulary changes
- RawData -> RawDatum
- Favicon -> Icon
- Provider -> Source
- Provider App -> SourceConfig
- Normalizer -> Transformer
Updates to the getting started guide
Squashed migrations to speed up local development
Harvesters are now expected to return utf-8 strings
Sources are no longer tied to the ShareUser model

[2.5.0] - 2017-03-15

Added

Title now has an "exact" multi-field in elasticsearch
A robot that archives old succeeded celery jobs
New Harvesters
- Scholarly Commons @ JMU

Fixed

Compensate for potential race conditions with the push API

[2.4.0] - 2017-02-10

Added

New Harvesters
- Research Registry Harvester
- SSOAR
Status API endpoint

Changed

Updated set_specs for University of Kansas
ClinicalTrials.gov now outputs registrations
Source icons are now stored in the database

Fixed

Removed "Notify" from the page title in the browsable API

[2.3.0] - 2017-02-02

Added

Support for OSF Registries
New Harvesters
- University of Utah

Changed

Updated the API
Improved Elasticsearch mappings
Updated NIH and NSFAwards
- Affiliations are now gathered
- Non-Unique URLs are no longer collected
Lots of under-the-hood changes to make devs' lives easier

[2.1.0] - 2016-12-16

Added

New Harvesters
- es.csic
- edu.purdue.epubs
Site status banners
Retraction harvesting
A little bit of documentation

Changed

OAuth login failure pages look nice now
Cascade deletes are now implemented as database cascades

[2.0.0] - 2016-12-02

Added

New Harvesters
- edu.cornell
- edu.richmond
- edu.scholarworks_montana
- edu.ucf
- edu.umd
- edu.utahstate
- org.seafdec
Relations between creative works
Updated harvesters
- Figshare v2 API
- PeerJ XML API
- Pubmed PMC prefix
- Datacite 4.0
BePress Taxonomy for subjects
Travis now uses postgres 9.5
Comprehensive test suite for normalization and disambiguation

Changed

Updated data model
- More expressive relations between people/organizations and works
- Type hierarchies
  - Creative works: Publication, Preprint, DataSet, Patent, Thesis, Software, etc.
  - Agents: Person, Organization, Institution, Consortium
More aggressive and intelligent data parsing
Stricter validation of incoming data
Prune duplicate objects from submitted changesets
Various bug fixes
Formalized disambiguation methods
App bootstrap time improved by 4x
Better elasticsearch mappings
- URI may now be searched/matched directly
Prettier table names

[1.0.0] - 2016-10-06

Added

Backport of the V1 push API
New and improved source registration form
JSON schema endpoint
New sources
- College of William and Mary
- University of Wisconsin
