Open
Labels
Needs Clarification, OMOP ETL, bug, priority:high
Description
Hi,
I got further along with the 0.3.8 ETL (https://github.com/rvandewater/meds_etl) and hit the error below while it was processing meds_unsorted. Could this be a problem in the source data? I suspect some incorrect type inference happened along the way.
Generating metadata from OMOP `concept` table
Generating metadata from OMOP `concept` table: 100%|██████████| 1/1 [00:18<00:00, 18.88s/it]
Generating metadata from OMOP `concept_relationship` table: 100%|██████████| 1/1 [00:02<00:00, 2.21s/it]
Extracting dataset metadata: 100%|██████████| 1/1 [00:00<00:00, 750.32it/s]
Decompressing OMOP tables, mapping to MEDS Unsorted format, writing to disk...
0it [00:00, ?it/s]
82%|████████▊ | 232/282 [24:47<05:15, 6.32s/it]
100%|██████████| 282/282 [30:11<00:00, 6.42s/it]
Finished converting dataset to MEDS Unsorted.
Converting from MEDS Unsorted to MEDS...
Traceback (most recent call last):
  File "/hpc/users/vander09/.conda/envs/meds_etl_033/bin/meds_etl_omop", line 8, in <module>
    sys.exit(main())
  File "/hpc/users/vander09/.conda/envs/meds_etl_033/lib/python3.10/site-packages/meds_etl/omop.py", line 798, in main
    meds_etl.unsorted.sort(
  File "/hpc/users/vander09/.conda/envs/meds_etl_033/lib/python3.10/site-packages/meds_etl/unsorted.py", line 303, in sort
    sort_polars(source_unsorted_path, target_meds_path, num_shards, num_proc)
  File "/hpc/users/vander09/.conda/envs/meds_etl_033/lib/python3.10/site-packages/meds_etl/unsorted.py", line 204, in sort_polars
    update_types(property_columns_info, get_columns(task))
  File "/hpc/users/vander09/.conda/envs/meds_etl_033/lib/python3.10/site-packages/meds_etl/unsorted.py", line 181, in update_types
    assert v == type_dict[k], f"We got different types for column {k}, {v} vs {type_dict[k]}"
AssertionError: We got different types for column visit_id, String vs Int64
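To narrow down which intermediate files disagree on the type of visit_id, something like the sketch below might help. It is not part of meds_etl: `find_type_conflicts` and the example file names are hypothetical, and it only assumes that a column-to-dtype mapping can be obtained for each file (for Parquet files, `polars.read_parquet_schema` is one way to get one). Instead of stopping at the first mismatch like the assertion does, it collects every column that shows up with more than one dtype, together with the files that use each dtype.

```python
from collections import defaultdict


def find_type_conflicts(schemas):
    """Return {column: {dtype: [files]}} for columns seen with more than one dtype.

    `schemas` maps a file name to a {column name: dtype name} dict.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for file_name, columns in schemas.items():
        for column, dtype in columns.items():
            seen[column][dtype].append(file_name)
    # Keep only the columns whose dtype differs between files.
    return {
        column: dict(by_dtype)
        for column, by_dtype in seen.items()
        if len(by_dtype) > 1
    }


# Hypothetical schemas for illustration; real ones would be read from the files.
example_schemas = {
    "shard_0.parquet": {"subject_id": "Int64", "visit_id": "Int64"},
    "shard_1.parquet": {"subject_id": "Int64", "visit_id": "String"},
}

for column, by_dtype in find_type_conflicts(example_schemas).items():
    for dtype, files in by_dtype.items():
        print(f"{column}: {dtype} in {files}")
```

If only a handful of files report String for visit_id, that would point at specific source tables rather than at the sort step itself.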