FHIRBase is a PostgreSQL extension for storing and retrieving FHIR resources. You can interact with FHIRBase using any PostgreSQL client. We advise you to start with pgAdmin, because it has easy-to-use graphical interface. However, other options are available.
SQL is the language in which you "talk" to FHIRBase. If you don't have at least basic knowledge of SQL, we strongly advise to read some books or tutorials on the Web in the first place.
We assume that you have successfully installed FHIRBase using Vagrant and already configured connection parameters in your PostgreSQL client.
In SQL world it's conventional to insert data with INSERT
statement,
delete it with DELETE
, update with UPDATE
and so on. FHIRBase uses
less common approach - it forces you to use
stored procedures for
data manipulation. Reason for this is that FHIRBase needs to perform
additional actions on data changes in order to keep FHIR-specific
functionality (such as
search and
versioning)
working.
There are some exceptions from this rule in data retrieval cases. For
example you can SELECT ... FROM resource
to search for specific
resource or set of resources. But when you create, delete or modify
something, you have to use corresponding stored procedures
(hereinafter, we'll refer them as SP).
SQL has strict type checking, so SP's arguments and return values are
typed. When describing SP, we will put type of every argument in
parens. For example, if argument cfg
has jsonb
type, we'll write:
You can take a look at overview of standard PostgreSQL types.
FHIR standard allows to use two formats for data exchange: XML and JSON. They are interchangeable, what means any XML representation of FHIR resource can be unambiguously transformed into equivalent JSON representation. FHIRBase team has released a XSL 2.0 stylesheet for such purpose.
Considering interchangeability of XML and JSON FHIRBase team decided to discard XML format support and use JSON as only format. There are several advantages of such decision:
When SP's argument has type jsonb
, that means you have to pass some
JSON as a value. To do this, you need to represent JSON as
single-line PostgreSQL string. You can do this in many ways, for
example, using a
online JSON formatter. Copy-paste your
JSON into this tool, cick "Remove white space" button and copy-paste
result back to editor.
Another thing we need to do before using JSON in SQL query is quote escaping. Strings in PostgreSQL are enclosed in single quotes. Example:
SELECT 'this is a string';
?column?
------------------
this is a string
(1 row)
If you have single quote in your string, you have to double it:
SELECT 'I''m a string with single quote!';
?column?
---------------------------------
I'm a string with single quote!
(1 row)
So if your JSON contains single quotes, Find and Replace them with two single quotes in any text editor.
Finally, get your JSON, surround it with single quotes, and append
::jsonb
after closing quote. That's how you pass JSON to PostgreSQL.
SELECT '{"foo": "i''m a string from JSON"}'::jsonb;
jsonb
-----------------------------------
{"foo": "i'm a string from JSON"}
(1 row)
Right after installation FHIRBase is "empty", it doesn't have any data we can operate with. So let's create some resources first.
Resources are created with fhir_create SP which takes four arguments:
Returns (jsonb): Bundle containing newly created Resource.
Following query will create a Patient resource from standard FHIR example without any tags:
SELECT fhir.create(
'{"resourceType":"Patient","identifier":[{"use":"usual","label":"MRN","system":"urn:oid:1.2.36.146.595.217.0.1","value":"12345","period":{"start":"2001-05-06"},"assigner":{"display":"Acme Healthcare"}}],"name":[{"use":"official","family":["Chalmers"],"given":["Peter","James"]},{"use":"usual","given":["Jim"]}],"telecom":[{"use":"home"},{"system":"phone","value":"(03) 5555 6473","use":"work"}],"gender":{"coding":[{"system":"http://hl7.org/fhir/v3/AdministrativeGender","code":"M","display":"Male"}]},"birthDate":"1974-12-25","deceasedBoolean":false,"address":[{"use":"home","line":["534 Erewhon St"],"city":"PleasantVille","state":"Vic","zip":"3999"}],"contact":[{"relationship":[{"coding":[{"system":"http://hl7.org/fhir/patient-contact-relationship","code":"partner"}]}],"name":{"family":["du","Marché"],"_family":[{"extension":[{"url":"http://hl7.org/fhir/Profile/iso-21090#qualifier","valueCode":"VV"}]},null],"given":["Bénédicte"]},"telecom":[{"system":"phone","value":"+33 (237) 998327"}]}],"managingOrganization":{"reference":"Organization/1"},"active":true}'::jsonb,
);
fhir.create
---------------------------------------------------------------------------------
{"id": "8d33a19b-af36-4e70-ae64-e705507eb074", "name": ....}
When resource is created, FHIRBase assigns unique identifier to it. We need to "remember" (copy-paste) this identifier for later use. Copy-paste this ID somewhere, because we'll need it in the next step.
To read latest version of Resource use fhir.read SP:
Use following code to invoke fhir.read
, just replace [URL]
with
Patient's identifier from previous step:
SELECT fhir.read('Patient', '[URL]');
fhir.read
---------------------------------------------------------------------------------
{"id": "b1f2890a..",...}
Instead of invoking fhir.read SP, you can SELECT
resource data
from resource
table.
Replace [logical ID]
in following query with logical ID of
previously inserted Patient resource and execute it.
SELECT content FROM resource
WHERE logical_id = '[logical ID]'
AND resource_type = 'Patient';
content
---------------------------------------------------------------------------------
{"name": [{"use": "official", "given": ["Peter", "James"], "family": ["Chalmers"]},
{"use": "usual", "given": ["Jim"]}],
[... skipped ...]
resource
table contains latest versions of all resources stored in
FHIRBase. It must be said that no data is stored in it. Each type of
resource has it's own table: patient
, adversereaction
,
encounter
, etc (48 total) where all resource data is actually
stored. Each of them inherits resource
table using
PostgreSQL Table Inheritance
feature. So when you SELECT ... FROM resource
, PostgreSQL executes
your query on every inherited table and then union results. Such
approach might be inefficient, especially on complex queries, and
that's why it's important to use WHERE resource_type = '...'
predicate. When you specify resource_type
, PostgreSQL knows exactly
which inherited table to touch. Alternatively, you can select directly
from inherited table:
SELECT content FROM patient
WHERE logical_id = '[logical ID]';
content
---------------------------------------------------------------------------------
{"name": [{"use": "official", "given": ["Peter", "James"], "family": ["Chalmers"]},
{"use": "usual", "given": ["Jim"]}],
[... skipped ...]
Generally, SELECT
ing data from resource
table by logical ID and
resource type is as fast as SELECT
ing from inherited table by
logical ID only.
To update resource data use fhir_update SP:
To read latest version of resource use already discussed fhir.read SP:
SELECT fhir.read('Patient', '[logical id]');
fhir.read
----------------------------------------------------------------------------
{"id": "b1f2890a-0536-4742-9d39-90be5d4637ee",...}
TODO: write about meta.versionId
Now let's invoke fhir_update
with version URL we just received and
change Patient.text value:
SELECT fhir_update(updated_resource) ;
fhir_update
-------------------------------------------------------------------------
{"id": "b1f2890a-0536-4742-9d39-90be5d4637ee", ....}
If meta.versionId in resource you passed to fhir.update
isn't latest (optimistic locking has failed), you'll receive error message:
ERROR: Wrong version_id 43d7c2cf-a1b5-4602-b9a2-ec55d1a2dda8. Current is abb33ccc-bb5a-4875-af43-9b3bba62a95c
To receive all versions of specific resource use fhir.history SP:
Invoking fhir_history is quite straightforward:
SELECT fhir_history(
'{"base": "http://localhost.local"}'::jsonb,
'Patient',
'[URL]',
'{}'::jsonb);
fhir_history
---------------------------------------------------------------------------
[... skipped ...]
Also there is a fhir_vread SP to read single version of some resource:
SELECT fhir_vread(
'{"base": "http://localhost.local"}'::jsonb,
'Patient',
'[version URL]');
fhir_vread
----------------------------------------------------------------------------
{"id": "34d2ec09-9211-4c95-a591-905279cc8212", "entry": [{"id": "http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee", "link": [{"rel": "self", "href": "http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee/_history/43d7c2cf-a1b5-4602-b9a2-ec55d1a2dda8"}],
[... skipped ...]
Search is the most tricky part of FHIR Standard. FHIRBase implements most of Search features:
We'll demonstrate how to perform simple search and will leave other cases for separate article.
Search is performed with fhir_search SP:
Let's start with searching for all patients with name containing "Jim":
SELECT fhir_search(
'{"base": "http://localhost.local"}'::jsonb,
'Patient',
'name=Jim');
fhir_search
----------------------------------------------------------------------------
{"id": "3367b97e-4cc3-4afa-8d55-958ed686dd10", "entry": [{"id": "http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee",
[ ... skipped ... ]
"resourceType": "Bundle", "totalResults": 1}
Search by MRN identifier:
SELECT fhir_search(
'{"base": "http://localhost.local"}'::jsonb,
'Patient',
'identifier=urn:oid:1.2.36.146.595.217.0.1|12345');
fhir_search
----------------------------------------------------------------------------
{"id": "3367b97e-4cc3-4afa-8d55-958ed686dd10", "entry": [{"id": "http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee",
[ ... skipped ... ]
"resourceType": "Bundle", "totalResults": 1}
Search by gender:
SELECT fhir_search(
'{"base": "http://localhost.local"}'::jsonb,
'Patient',
'gender=http://hl7.org/fhir/v3/AdministrativeGender|M');
fhir_search
----------------------------------------------------------------------------
{"id": "3367b97e-4cc3-4afa-8d55-958ed686dd10", "entry": [{"id": "http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee",
[ ... skipped ... ]
"resourceType": "Bundle", "totalResults": 1}
Combining several conditions:
SELECT fhir_search(
'{"base": "http://localhost.local"}'::jsonb,
'Patient',
'gender=http://hl7.org/fhir/v3/AdministrativeGender|M&name=Jim&identifier=urn:oid:1.2.36.146.595.217.0.1|12345');
fhir_search
----------------------------------------------------------------------------
{"id": "3367b97e-4cc3-4afa-8d55-958ed686dd10", "entry": [{"id": "http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee",
[ ... skipped ... ]
"resourceType": "Bundle", "totalResults": 1}
To delete resource, use fhir_delete SP:
SELECT fhir_delete(
'{"base": "http://localhost.local"}'::jsonb,
'Patient',
'[URL]'
);
fhir_delete
----------------------------------------------------------------------------
{"id": "3367b97e-4cc3-4afa-8d55-958ed686dd10", "entry": [{"id": "http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee",
[ ... skipped ... ]
"resourceType": "Bundle", "totalResults": 1}
NB: History of resource is also deleted:
SELECT fhir_history(
'{"base": "http://localhost.local"}'::jsonb,
'http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee',
'{}'::jsonb
);
fhir_history
----------------------------------------------------------------------------
{"id": "6d31ea93-49b3-47d2-9249-1f47d1e72c39", "entry": [], "title": "History of resource with type=http://localhost.local/Patient/b1f2890a-0536-4742-9d39-90be5d4637ee", "updated": "2014-11-25T12:42:47.634399+00:00", "resourceType": "Bundle", "totalResults": 0}
FHIR is a specification of semantic resources and API for working with healthcare data. Please address the official specification for more details.
To implement FHIR server we have to persist & query data in an application internal format or in FHIR format. This article describes how to store FHIR resources in a relational database (PostgreSQL), and open source FHIR storage implementation - FHIRbase.
FHIRbase is built on top of PostgreSQL and requires its version higher than 9.4 (i.e. jsonb support).
FHIR describes ~100 resources as base StructureDefinitions which by themselves are resources in FHIR terms.
To setup FHIRbase use Installation Guide.
FHIRbase stores each resource in two tables - one for current version and second for previous versions of the resource. Following a convention, tables are named in a lower case after resource types: Patient => patient, StructureDefinition => structuredefinition.
For example Patient resources are stored in patient and patient_history tables:
fhirbase=# \d patient
Table "public.patient"
Column | Type | Modifiers
---------------+--------------------------+-------------------------
version_id | text |
logical_id | text | not null
resource_type | text | default 'Patient'::text
updated | timestamp with time zone | not null default now()
published | timestamp with time zone | not null default now()
category | jsonb |
content | jsonb | not null
Inherits: resource
fhirbase=# \d patient_history
Table "public.patient_history"
Column | Type | Modifiers
---------------+--------------------------+-------------------------
version_id | text | not null
logical_id | text |
resource_type | text | default 'Patient'::text
updated | timestamp with time zone | not null default now()
published | timestamp with time zone | not null default now()
category | jsonb |
content | jsonb | not null
Inherits: resource_history
All resource tables have similar structure and are inherited from resource table, to allow cross-table queries (for more information see PostgreSQL inheritance).
Minimal installation of FHIRbase consists of only a few tables for "meta" resources:
These tables are populated with resources provided by FHIR distribution.
Most of API for FHIRbase is represented as functions in fhir schema, other schemas are used as code library modules.
First helpful function is fhir.generate_tables(resources text[])
which generates tables
for specific resources passed as array.
For example to generate tables for patient, organization and encounter:
psql fhirbase
fhirbase=# select fhir.generate_tables('{Patient, Organization, Encounter}');
-- generate_tables
-----------------
-- 3
-- (1 row)
If you call generate_tables() without any parameters, then tables for all resources described in StructureDefinition will be generated:
fhirbase=# select fhir.generate_tables();
-- generate_tables
-----------------
-- 93
--(1 row)
When concrete resource type tables are generated, column installed for this resource is set to true in the profile table.
SELECT logical_id, installed from structuredefinition
WHERE logical_id = 'Patient'
-- logical_id | installed
----------------------------
-- Patient | true
Functions representing public API of FHIRbase are all located in the FHIR schema. The first group of functions implements CRUD operations on resources:
SELECT fhir.create('{"resourceType":"Patient", "name": [{"given": ["John"]}]}')
-- {
-- "id": "c6f20b3a...",
-- "meta": {
-- "versionId": "c6f20b3a...",
-- "lastUpdated": "2015-03-05T15:53:47.213016+00:00"
-- },
-- "name": [{"given": ["John"]}],
-- "resourceType": "Patient"
-- }
--(1 row)
-- create - insert new row into patient table
SELECT resource_type, logical_id, version_id,* from patient;
-- resource_type | logical_id | version_id | content
---------------+---------------------------------------------------
-- Patient | c6f20b3ab... | c6f20b.... | {"resourceType".....}
--(1 row)
SELECT fhir.read('Patient', 'c6f20b3a...');
-- {
-- "id": "c6f20b3a...",
-- "meta": {
-- "versionId": "c6f20b3a...",
-- "lastUpdated": "2015-03-05T15:53:47.213016+00:00"
-- },
-- "name": [{"given": ["John"]}],
-- "resourceType": "Patient"
-- }
--(1 row)
SELECT fhir.update(
jsonbext.merge(
fhir.read('Patient', 'c6f20b3a...'),
'{"name":[{"given":"Bruno"}]}'
)
);
-- returns update version
SELECT count() FROM patient; => 1
SELECT count() FROM patient_history; => 1
-- read previous version of resource
SELECT fhir.vread('Patient', /*old_version_id*/ 'c6f20b3a...');
SELECT fhir.history('Patient', 'c6f20b3a...');
-- {
-- "type": "history",
-- "entry": [{
-- "resource": {
-- "id": "c6f20b3a...",
-- "meta": {
-- "versionId": "a11dba...",
-- "lastUpdated": "2015-03-05T16:00:12.484542+00:00"
-- },
-- "name": [{
-- "given": "Bruno"
-- }],
-- "resourceType": "Patient"
-- }
-- }, {
-- "resource": {
-- "id": "c6f20b3a...",
-- "meta": {
-- "versionId": "c6f20b3a...",
-- "lastUpdated": "2015-03-05T15:53:47.213016+00:00"
-- },
-- "name": [{
-- "given": ["John"]
-- }],
-- "resourceType": "Patient"
-- }
-- }],
-- "resourceType": "Bundle"
-- }
SELECT fhir.is_exists('Patient', 'c6f20b3a...'); => true
SELECT fhir.is_deleted('Patient', 'c6f20b3a...'); => false
SELECT fhir.delete('Patient', 'c6f20b3a...');
-- return last version
SELECT fhir.is_exists('Patient', 'c6f20b3a...'); => false
SELECT fhir.is_deleted('Patient', 'c6f20b3a...'); => true
When resource is created, logical_id and version_id are generated as uuids. On each update resource content is updated in the patient table, and old version of the resource is copied into the patient_history table.
Next part of API is a search API. Folowing functions will help you to search resources in FHIRbase:
select fhir.search('Patient', 'given=john')
-- returns bundle
-- {"type": "search", "entry": [...]}
-- return search as relatio
select * from fhir._search('Patient', 'name=david&count=10');
-- version_id | logical_id | resource_type
------------+----------------------------------
-- | "a8bec52c-..." | Patient
-- | "fad90884-..." | Patient
-- | "895fdb15-..." | Patient
-- expect generated by search sql
select fhir.search_sql('Patient', 'given=david&count=10');
-- SELECT * FROM patient
-- WHERE (index_fns.index_as_string(patient.content, '{given}') ilike '%david%')
-- LIMIT 100
-- OFFSET 0
-- explain query execution plan
select fhir.explain_search('Patient', 'given=david&count=10');
-- Limit (cost=0.00..19719.37 rows=100 width=461) (actual time=6.012..7198.325 rows=100 loops=1)
-- -> Seq Scan on patient (cost=0.00..81441.00 rows=413 width=461) (actual time=6.010..7198.290 rows=100 loops=1)
-- Filter: (index_fns.index_as_string(content, '{name,given}'::text[]) ~~* '%david%'::text)
-- Rows Removed by Filter: 139409
-- Planning time: 0.311 ms
-- Execution time: 7198.355 ms
Search works without indexing but search query would be slow on any reasonable amount of data. So FHIRbase has a group of indexing functions:
Indexes are not for free - they eat space and slow inserts and updates. That is why indexes are optional and completely under you control in FHIRbase.
Most important function is fhir.index_search_param
which
accepts resourceType as a first parameter, and name of search parameter to index.
select count(*) from patient; --=> 258000
-- search without index
select fhir.search('Patient', 'given=david&count=10');
-- Time: 7332.451 ms
-- index search param
SELECT fhir.index_search_param('Patient','name');
--- Time: 15669.056 ms
-- index cost
select fhir.admin_disk_usage_top(10);
-- [
-- {"size": "107 MB", "relname": "public.patient"},
-- {"size": "19 MB", "relname": "public.patient_name_name_string_idx"},
-- ...
-- ]
-- search with index
select fhir.search('Patient', 'name=david&count=10');
-- Time: 26.910 ms
-- explain search
select fhir.explain_search('Patient', 'name=david&count=10');
------------------------------------------------------------------------------------------------------------------------------------------------
-- Limit (cost=43.45..412.96 rows=100 width=461) (actual time=0.906..6.871 rows=100 loops=1)
-- -> Bitmap Heap Scan on patient (cost=43.45..1569.53 rows=413 width=461) (actual time=0.905..6.859 rows=100 loops=1)
-- Recheck Cond: (index_fns.index_as_string(content, '{name}'::text[]) ~~* '%david%'::text)
-- Heap Blocks: exact=100
-- -> Bitmap Index Scan on patient_name_name_string_idx (cost=0.00..43.35 rows=413 width=0) (actual time=0.205..0.205 rows=390 loops=1)
-- Index Cond: (index_fns.index_as_string(content, '{name}'::text[]) ~~* '%david%'::text)
-- Planning time: 0.449 ms
-- Execution time: 6.946 ms
relname | size -------------------------------------------------+--------- public.observation | 22 GB public.observationversionidkey | 1063 MB public.observationpkey | 1063 MB public.patient | 454 MB public.patientnamenametokenidx | 140 MB public.patientbirthdatebirthdate_dateidx | 97 MB public.patientversionidkey | 89 MB public.patientpkey | 87 MB public.patientfamilyfamilytokenidx | 72 MB public.patientgivengiventokenidx | 60 MB gen.family | 7504 kB public.patientidentifieridentifiertokenidx | 3056 kB gen.given | 2608 kB public.patientgendergendertokenidx | 2448 kB public.patientactiveactivetokenidx | 2440 kB public.patientlanguagecommunicationtokenidx | 2432 kB public.patientanimalbreedbreedtokenidx | 2432 kB public.patientlinkothertokenidx | 2432 kB public.patienttelecomtelecomtokenidx | 2432 kB public.patientanimalspeciesspeciestokenidx | 2432 kB (20 rows)
27145284 (1 row)
Quick installation on ubuntu:
sudo apt-get install -qqy postgresql-9.4 postgresql-contrib-9.4 curl python
sudo su postgres -c 'createuser -s <you-local-user>'
export PGUSER=<you-local-user>
export DB=test
git clone https://github.com/fhirbase/fhirbase
cd fhirbase
./runme integrate
Code is split into two categories -
schema changing code (migrations in schema/
directory) and reloadable functions (in src/
).
To reduce sql boilerplate and modularize application
we use simple SQL preprocessor written in python (in ql/
directory).
All tasks for fhirbase could be done using runme
script in form
bash
env DB=test ./runme <command> <args>
Here is the list of subcomands:
PostgreSQL has one-level namespaces - schemas.
We use schemas to implement code modules, this allow create and reload modules as a whole.
Every file in src/
is mapped to schema with same name as file.
For example functions defined in src/fhir.sql
will be in fhir schema.
Having modules allows us introduce dependencies between modules. They implemented as magic comments in source files and using loader (./runme).
For example you have created module a.sql with function util
. Then you want to
create module b.sql, which depends on module a
:
-- file src/b.sql
-- #import ./a.sql
...
b.util()
...
Then you can load module b into databse:
./runme load src/a.sql
Loader read #import
instructions and resolve dependencies recursively,
then load all modules in right order and write module status into special table modules
.
Loader also calculate sha1 of module file
content and reload module only if hash changed (caching).
If you run load
command again without changes in files - it will do nothing.
To reduce sql boilerplate you can use some macro expansions in source files.
you can write sql functions in python style, ie body of function should be idented:
-- FILE: src/coll.sql
func _butlast(_ar_ anyarray) returns anyarray
SELECT _ar_[array_lower(_ar_,1) : array_upper(_ar_,1) - 1]
Preprocessor will produce:
-- FILE: src/coll.sql
drop schema if exists coll cascade;
create schema coll;
CREATE OR REPLACE
function coll._butlast(_ar_ anyarray) RETURNS anyarray
LANGUAGE sql AS $$
SELECT _ar_[array_lower(_ar_,1) : array_upper(_ar_,1) - 1] -- coll:3
$$ IMMUTABLE;
You can always inspect result of preprocessor by running ./runme compile src/file.sql
.
Here is list of macro expansions:
To allow update existing fhirbase installation to next version, we track schema changes
using migrations approach - all schema changes are in schema/
directory and enumerated using
timestamps. When you run /.runme migrate
all pending migrations will be applied.
There are some preprocessor sugar for tests:
-- one-line test
select 1 => 1
---will be compiled into:
SELECT tests.expect('', 'test/<file>.sql:<line-number>',(select 1),(1));
Here is also multiline form (body should be idented):
expect "Comment"
select 1
=> 1
-- compiled
SELECT tests.expect('Comment', 'test/<file>.sql:<line-number>',(select 1),(1));
To test for exeptions there is special syntax:
expect_raise 'does not exist'
SELECT crud.delete('{}'::jsonb, 'Patient', 'nonexisting')
--- compiled
SELECT tests.expect_raise('does not exist', 'test/fhir_crud_spec.sql:151',
($SQL$
SELECT crud.delete('{}'::jsonb, 'Patient', 'nonexisting') -- fhir_crud_spec:151
$SQL$));
Source code of expect functions is in src/tests.sql.