High acquisition rate and internal validity in the Scandinavian Obesity Surgery Registry

Background: The Scandinavian Obesity Surgery Registry (SOReg) is a national quality register that has collected data on bariatric surgery in Sweden since 2007. Objective: To evaluate the acquisition rate, internal validity, and completeness of data entered in SOReg. Settings: National quality register, Sweden. Method: The acquisition rate during 2012–2018 was compared with Swedish national databases, while registered data in 89 selected variables (67 mandatory) were compared with medical records of 1860 randomly selected patients from all bariatric centers (n = 39–43) reporting to SOReg. The evaluation was done by 1 independent observer. Completeness of data in the entire registry for the same time period was studied. Results: The acquisition rate was 97.4%: 33,716 of 34,628 patients recorded in the National Inpatient Registry were registered in SOReg. Exact agreement of reabstracted data was seen in 99.0% of 100,200 unique entries. All studied variables had an almost perfect agreement with Cohen’s kappa ranging from .87–1, that is, >.81 according to Landis and Koch criteria. In addition, .3% (n = 301) missing data entries were discovered, mostly in administrative variables. In the mandatory variables, overall completeness was high; however, it declined over time in parallel with a reduced follow-up rate (50% at 5 years). Conclusion: The high acquisition rate and internal validity imply that SOReg reflects Swedish bariatric surgery on a nationwide basis. Hence, SOReg data can be used to monitor quality of care and in research.

Reliable databases form the foundation of modern research and healthcare planning. Because collection of clinical data in greater volumes is needed to monitor and improve quality of care, national quality registers, focusing on a specific diagnosis or intervention, have been started worldwide. In 2004, the Scandinavian Obesity Surgery Registry (SOReg) was founded as a national quality register for bariatric surgery in Sweden [1]. The aim was 2-fold: to promote quality of Swedish bariatric surgery, and to become a source for scientific development of new knowledge in the field. All registrations are based on the unique personal identification number (PIN), routinely used in the Swedish healthcare system. This PIN allows cross-matching to various high-quality national registers run by the National Board of Health and Welfare, such as the National Patient Register (NPR) [2], containing all given inpatient care.
Currently, SOReg contains clinical information on more than 75,000 patients who have undergone bariatric surgery in Sweden since 2007 [3]. The database is divided into 3 separate parts: baseline data, perioperative results (including the first 30 postoperative days), and long-term follow-up (1, 2, 5, and recently also, 10 years) [1]. The register contains 294 primary variables, of which 154 are mandatory. In addition, there are 41 automatically calculated variables, such as body mass index (BMI) and excess weight loss (EWL), and 353 secondary and tertiary variables [4] (Supplementary Table 1). The variables include administrative data, patient demographic characteristics, medical conditions such as obesity-related diseases, and operative data, including short- and long-term complications, as well as laboratory data and patient-scored quality of life measured by the 36-Item Short Form Health Survey developed by RAND (RAND-36) [5] and the Obesity-related Problems Scale (OP) [6]. All variables, as well as their outcomes, are defined, and data entry is done online. Since 2013, all bariatric centers, both public and private, have participated in SOReg. When studying registered bariatric procedures in 2011, an external validation of surgical procedure codes found high accuracy (97%) between SOReg and NPR. The authors concluded that SOReg was a reliable source to identify patients who have undergone bariatric surgery in Sweden [7]. Although acquisition rate was high, internal validity, that is, the correctness of entered data, is of even greater importance, especially if the register is to be used for research and other purposes that demand accurate data.
The present study aimed to validate SOReg data from 2012-2018 in 3 different aspects: overall acquisition rate, internal validity in a selected group of patients, and completeness of data in the entire database.

Acquisition rate
The acquisition rate for 2012-2018 was calculated by dividing the registered number of patients in SOReg by the number of patients having had bariatric surgery according to NPR. A few private hospitals (5-9 during the study period) that did not report to NPR are described separately.
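The calculation described here is a simple proportion. The following is a minimal sketch, using the patient counts reported in the abstract; the function name and the input checks are illustrative assumptions, not part of the registry's own tooling.

```python
def acquisition_rate(found_in_register: int, total_in_reference: int) -> float:
    """Share of patients in the reference source (here NPR) that are also
    found in the quality register (here SOReg)."""
    if total_in_reference <= 0:
        raise ValueError("the reference count must be positive")
    if not 0 <= found_in_register <= total_in_reference:
        raise ValueError("the register count must lie between 0 and the reference count")
    return found_in_register / total_in_reference


# Counts reported in the abstract for 2012-2018.
rate = acquisition_rate(33_716, 34_628)
print(f"{rate:.1%}")  # 97.4%
```

Patients found only in SOReg (for example from private hospitals not reporting to NPR) do not enter this ratio, because the reference source is the denominator.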

Internal validity
For validation of data correctness, 1 independent observer (specialist nurse with special interest in bariatric surgery) compared registered data with medical records in 1860 patients from all Swedish bariatric centers (n = 39-43 during the study period). Patients were randomly selected, 10-25 for each hospital, depending on the number of performed bariatric procedures. In total, 89 core variables, both mandatory (n = 66) and nonmandatory (n = 23), were selected for validation. The selected variables were chosen to mirror most of the subgroups in the registry, that is, administrative data, patient demographic characteristics, obesity-related diseases, operative data, short- and long-term complications, and laboratory data as well as the different registration points (baseline, operative procedure, and follow-up). Reabstraction was done on a paper form in 3 rounds during 2012-2018, when visiting the respective centers. If needed, an expert bariatric surgeon assisted in evaluating discrepancies. After the reabstraction was completed, all data were compared with the original registrations, and analyzed for missing values, exact agreement, and correlation.

Completeness of data
The percentage of missing data for a selected number of variables was calculated in April 2020 by using all entries on operated patients (n = 44,777) in 2012-2018 in the entire database.
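As an illustration of this kind of calculation, the sketch below computes the percentage of non-missing entries per variable. The record layout and the variable names (`weight_kg`, `hba1c`) are hypothetical and chosen only for the example; they are not SOReg field names.

```python
from typing import Dict, List, Optional, Sequence


def completeness_by_variable(
    records: Sequence[Dict[str, Optional[float]]],
    variables: Sequence[str],
) -> Dict[str, float]:
    """Return, for each variable, the percentage of records with a value.

    A value counts as missing when the key is absent or set to None.
    """
    if not records:
        raise ValueError("at least one record is required")
    n = len(records)
    return {
        var: 100.0 * sum(rec.get(var) is not None for rec in records) / n
        for var in variables
    }


# Four made-up records; one of them lacks a laboratory value.
example_records: List[Dict[str, Optional[float]]] = [
    {"weight_kg": 120.5, "hba1c": 42.0},
    {"weight_kg": 131.0, "hba1c": None},
    {"weight_kg": 118.2, "hba1c": 39.0},
    {"weight_kg": 125.7, "hba1c": 51.0},
]
result = completeness_by_variable(example_records, ["weight_kg", "hba1c"])
print(result)  # {'weight_kg': 100.0, 'hba1c': 75.0}
```

The percentage of missing data is then 100 minus the completeness of each variable.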

Statistics
Acquisition rate and completeness were assessed with descriptive statistics. Comparisons between groups were made by χ² test with P < .05 considered as statistically significant. Accuracy was analyzed by calculating proportions with exact agreement; Pearson's correlation test was used for numerical values and dates, and Cohen's kappa for ordinal values. The Landis and Koch criteria were used in rating the magnitude of agreement (<0 as no agreement, 0-.20 as slight, .21-.40 as fair, .41-.60 as moderate, .61-.80 as substantial, and .81-1 as almost perfect or perfect agreement) [8]. SPSS version 23 (IBM, Armonk, New York, United States) was used.
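To make the agreement measures concrete, the sketch below computes an unweighted Cohen's kappa for two sets of categorical ratings (for example registered versus reabstracted values) and maps the result onto the Landis and Koch bands listed above. It is an illustration in plain Python, not the SPSS procedure used in the study; the function names and the example data are assumptions.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (observed - expected) / (1 - expected)."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("both ratings must be non-empty and of equal length")
    n = len(first)
    # Proportion of pairs with exact agreement.
    observed = sum(x == y for x, y in zip(first, second)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[c] * counts_second[c] for c in counts_first) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources use one and the same category throughout
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa: float) -> str:
    """Verbal rating of a kappa value according to the Landis and Koch bands."""
    if kappa < 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up example: 10 patients, one discrepancy between registry and chart.
registry = ["yes"] * 5 + ["no"] * 5
chart = ["yes"] * 4 + ["no"] * 6
kappa = cohens_kappa(registry, chart)  # observed .9, expected .5, kappa about .8
print(round(kappa, 2), landis_koch(0.873))
```

Note that 90% exact agreement corresponds here to a kappa of only about .8, because half of the agreement would be expected by chance alone.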
Ethical approval was obtained from the regional ethical review board of Stockholm (Ref 2017/857-32).

Acquisition rate
As shown in Table 1, the acquisition rate was consistently high (96.6%-98.1%) during 2012-2018, with a mean acquisition rate of 97.4% of patients also registered in NPR. Furthermore, SOReg contained 25% more patients having had bariatric surgery than NPR, the majority from private hospitals (not reporting to NPR). These data were excluded from the calculations.
In general, incorrect values were found in all variables, at all 5 time points, and in all 3 validation rounds. However, a slightly higher number of incorrect values was found in the numerical variables, for example, weights, and dates, compared with the categorical variables, for example, complications and obesity-related diseases. In both weights and dates, the percentage of incorrect values decreased from the first to the third validation round: weights from 4.0% (114 of 2865) to 2.3% (57 of 2494) and dates from 2.3% (65 of 2865) to 1.8% (46 of 2494), with P < .05 for both. Moreover, the median difference between the registered entry and the correct value for weight and dates was rather small, 5 (1-59) kg and 59 (1-548) days, respectively.
During the evaluation, .3% missing values (n = 301) were found in the mandatory variables. Data were somewhat more often lacking in groups, for example, several variables from a specific time point (visit) or the same variable from a specific center.
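As an illustration of how such a comparison between validation rounds can be tested, the sketch below applies a Pearson χ² test (1 degree of freedom, no continuity correction) to the weight counts given above, first round versus third round. The exact test configuration used in the study is not stated, so the choice of a 2 × 2 table and the function name are assumptions.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom,
    without continuity correction.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Incorrect vs. correct weight entries: round 1 (114 of 2865), round 3 (57 of 2494).
chi2, p_value = chi_square_2x2(114, 2865 - 114, 57, 2494 - 57)
print(f"chi2 = {chi2:.2f}, P < .05: {p_value < 0.05}")
```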

Completeness
The completeness of mandatory variables was high at baseline, while the percentage of registered patients at the 1-, 2-, and 5-year follow-up decreased over time. The 2- and 5-year follow-up was close to 70% and 50%, respectively. However, data were entered in the mandatory variables for almost all patients having a follow-up visit (Table 3). The completeness of some variables may be slightly higher than in the validation part of the study because the revealed missing values were corrected when possible.
The nonmandatory variables contained about 3 quarters of possible data at baseline, with a lower rate for the variables containing laboratory data and scores from the two quality-of-life questionnaires (RAND-36 and OP).Although the completeness decreased in parallel with the reduced follow-up, most variables had 50% or more of expected data at 5 years.

Discussion
This study has shown a very high acquisition rate and internal validity in SOReg. The completeness of mandatory variables was also very high at baseline and for patients having a registered follow-up; however, the latter declines over time. The originally planned, and ongoing, validation process has been of considerable value for the registry.

Acquisition rate
In accordance with the earlier external validation by Tao in 2011 [7], the acquisition rate for the present study period (2012-2018) was shown still to be high (97.4%). The possibility to correctly calculate acquisition rates depends on the quality of the second source, used as denominator. Because it is mandatory for all public healthcare providers in Sweden to report all given hospital care to the NPR, we had access to high-quality national data [2]. However, since a rather large proportion of the bariatric procedures was done by private healthcare providers not required to report their cases to the National Board of Health and Welfare, we could only calculate the acquisition rate for nonprivate cases. On the other hand, this implies that SOReg can present a much more complete description of bariatric surgery in Sweden than NPR. This is of specific importance when reporting annual volumes of bariatric surgery to international surveys [9,10] or multinational registries, such as the IFSO Global Registry [11,12], run by the International Federation for the Surgery of Obesity and Metabolic Disorders.
Studies on national acquisition rates in bariatric surgery are scarce. In 2017, the Israel National Bariatric Surgery Registry showed a 98.7% acquisition rate when comparing their 40,815 registered bariatric procedures with hospital records [13]. Using the same methodology as the present (comparing with NPR), 51,999 of 64,538 (82.9%) Swedish cholecystectomies were registered in their corresponding national quality register [14]. Several Swedish quality registries focusing on malignant diagnoses have compared their data to another national register, the Cancer Register, also run by the National Board of Health and Welfare. In 2015, the National Prostate Cancer Register was found to capture 98% of 731 selected prostate cancer cases [15] and a similar high acquisition rate (95.5% of 6354 patients) was seen in the Swedish National Register for Oesophageal and Gastric Cancer [16]. Recently, 3 additional cancer registries, the Swedish Colorectal Cancer Registry [17], the Swedish Quality Register of Gynaecologic Cancer [18], and the Swedish National Register for Breast Cancer [19], reported high acquisition rates (95.0%-99.9%). Furthermore, in a SOReg study of obesity-related co-morbidity over 5 years, with cross linkage to a third national registry, the Prescribed Drug Register, the 2 registers were found to closely correlate [20]. Thus, the constant use of PIN in the Swedish healthcare system as well as in quality registers is invaluable; the latter also allows cross-matching between different quality registers [21-24].
Table 1 note: The percentage of SOReg operations found in the National Inpatient Registry (NPR) was divided by the total number of operations in the NPR. Note the rather large proportion of operations in SOReg that are not registered in NPR.
Outside of the present study, we annually check for discrepancies between SOReg and NPR. In most cases, we find a few patients who have been operated on for a complication (registered correctly as such in SOReg) but are erroneously reported as having had a first-time bariatric procedure in the governmental registry. Furthermore, some palliative procedures, for example, bypassing an unresectable distal gastric cancer, have mistakenly been reported as a bariatric gastric bypass. The high consistency is in line with an external review of the NPR itself, showing a positive predictive value of 85%-95% in many diagnoses [25], but a slightly lower predictive value in bariatric surgery due to the relatively large share of private cases (25%). In the handful of true missing bariatric cases in SOReg, we ask the center concerned to enter the missing patient in our registry, thus increasing the acquisition rate to slightly more than the present 97.4%.

Validation
To our knowledge, only 3 national registries on bariatric surgery perform structured and regular validation: SOReg Sweden, the Norwegian equivalent of SOReg, and the Dutch Audit for Treatment of Obesity (DATO) [26]. In a combined analysis of DATO and 6 other quality registers by the Dutch Institute for Clinical Auditing (DICA), data accuracy ranged from 88.2%-100%, with most discrepancies in postoperative complications (.7%-7.5%) [27]. When comparing registry data on co-morbidities in the Israel National Bariatric Surgery Registry against hospital files, the following Cohen's kappa coefficients were reported: hypertension and diabetes (.8) and sleep apnea (.7) [13]. Although these coefficients correspond to substantial agreement on the Landis and Koch scale (.61-.80), they are not as high as in the present study. In an early evaluation of the large American database created by the Metabolic and Bariatric Surgery Accreditation and Quality Improvement Program (MBSAQIP), the authors warned that up to 20% of data may be unusable for analysis due to data quality issues [27].
The high agreement between our reabstracted data and the original data implies that the daily registration in clinical practice works well. Before launching the register, considerable care was taken to create clear, robust definitions for the selected variables, and not to overload the register. To achieve a functional register from the start, some founding members first used a pilot version to ensure that all potential errors or uncertainties could be found and corrected. SOReg holds annual registry meetings, discussing uncertainties in the register and suggestions for improved care of bariatric patients. This resulted in the swift integration of new national recommendations to close the mesenteric defects in gastric bypass [28,29] and to avoid optic trocars, thus reducing the risk for aortic injuries in laparoscopic procedures [30,31].
The categorical variables, such as obesity-related diseases, clearly defined by the continuous use of specific drugs, had high agreement. Interestingly, the 2 nonmandatory variables for musculoskeletal pain or other obesity-related diseases influencing the decision for bariatric surgery did not have inferior agreement compared with the 7 mandatory variables. When considering how to improve the more difficult numerical variables, it is necessary to assess whether the small differences between the registered entry and the correct value are clinically important. Furthermore, in dates, only 1 incorrectly entered digit leads to a rather large discrepancy, for example, always at least 30 days in the registration of the month. It was not surprising that the more complex variable, Clavien-Dindo, combining postoperative complications into an overall grade, had a relatively lower validity (κ = .873) compared with the different postoperative complications as single items (κ = .937). Most importantly, the percentage of incorrect values decreased from the first to the third validation round, thus improving the quality of the register by the validation process itself.
Registry validation can be done with different methods such as comparing with other registries, continuous monitoring by dedicated personnel, examining by external validators, or by reabstracting data from medical records. The reabstraction method was chosen in the present study because other Swedish quality registers have been successfully validated in a similar way. In the Swedish Registry for Gallstone Surgery, correctness of data in a sample of 94,919 entries was 98.2% [14]. The agreement between medical charts and registered data has also been high in the Swedish cancer registries, 82.0%-91.1%. However, the number of patients or data entries studied has been lower in these evaluations than in the present study: prostate cancer (731 patients) [15], esophageal and gastric cancer (60 variables, 12,035 original entries) [16], colorectal cancer (486 patients, 130 variables) [17], gynecologic cancer (500 patients, 31 variables) [18], and breast cancer (800 patients) [19].

Completeness
For operated patients, a completeness close to 100% was found at baseline in the mandatory variables, while the percentage of patients having a registered follow-up decreased over time. The 2- and 5-year follow-up was close to 70% and 50%, respectively. However, data were entered in the mandatory variables for almost all patients having a follow-up visit. One explanation for the small amount of missing data in some mandatory variables is the possibility to override a missing mandatory variable with an affirmative action.
The overall completeness of SOReg could be improved by increased registration of follow-up data; that is, data that we know are stored in medical charts after clinical visits in primary care, but not reported to the register. To achieve this, we are working on automatic transferal of laboratory data from all types of electronic medical records as well as an app to collect patient-reported data directly from the operated individual.

Additional aspects
We believe that the following factors have contributed to the successful implementation of SOReg. First, considerable effort was put into ensuring the register was user-friendly; thus, all potential variables were scrutinized to prevent the register being overloaded by unnecessary or ill-defined variables. In contrast to some other Swedish quality registers, a minimum of variables was chosen, thus allowing future expansion in required areas during the run of the register [4] (Supplementary Table 1). Second, several technical details are included in the data platform to improve data quality. There are built-in calculations, for example, of BMI and age at surgery, to minimize errors in manual calculations. The register also contains built-in warnings for unusual data, and built-in blockage of illogical data, such as baseline date after operation date, systolic blood pressure lower than diastolic pressure, or weight below 35 kg. Furthermore, variables are constructed in a hierarchic way, thus removing nonapplicable data questions during the registration, and the register involves regular scanning of the database for unusual data and unusual data combinations, for example, postoperative hospital stay >4 days without a registered complication. Last, on first entry of data the register is cross run with the national population register to check whether the patient name is correct, to capture place of residence to district level, and to stop duplicate registrations. Every month the register is cross run with the population register to capture mortality, thus enabling 100% mortality data.
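The built-in checks described above can be sketched as a small set of rules. The following is a hedged illustration only: the class `Entry`, its field names, and the two functions are hypothetical and do not reflect the actual SOReg data platform, and the stay threshold is taken as more than 4 days.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Entry:
    """Hypothetical subset of a registration, for illustration only."""
    baseline_date: date
    operation_date: date
    weight_kg: float
    height_cm: float
    systolic_mmhg: int
    diastolic_mmhg: int
    postop_stay_days: int
    complication_registered: bool


def bmi(weight_kg: float, height_cm: float) -> float:
    """Automatically calculated variable: weight (kg) / height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m * height_m)


def blocking_errors(entry: Entry) -> List[str]:
    """Illogical data that would be blocked at entry."""
    errors = []
    if entry.baseline_date > entry.operation_date:
        errors.append("baseline date after operation date")
    if entry.systolic_mmhg < entry.diastolic_mmhg:
        errors.append("systolic pressure lower than diastolic pressure")
    if entry.weight_kg < 35:
        errors.append("weight below 35 kg")
    return errors


def review_flags(entry: Entry) -> List[str]:
    """Unusual combinations that are flagged for review rather than blocked."""
    flags = []
    if entry.postop_stay_days > 4 and not entry.complication_registered:
        flags.append("hospital stay above 4 days without a registered complication")
    return flags


suspect = Entry(date(2015, 3, 2), date(2015, 2, 1), 30.0, 170.0, 70, 120, 6, False)
print(blocking_errors(suspect))
print(review_flags(suspect))
```

Separating blocking rules from review flags mirrors the distinction in the text between illogical data, which cannot be saved, and unusual data, which is only brought to attention.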
To date, 95 scientific studies have been done on SOReg data and the acquired knowledge has recently been summarized for clinicians [32], with a special report on the effects of bariatric surgery on diabetes [33]. The possibility to link our registry data to other national registers by the PIN has been invaluable for these projects. A complete register can be used as a base for observational studies, thus allowing important comparisons between the results on selected patients in randomized controlled trials and those achieved in routine care [34]. A well-established quality register can also harbor randomized controlled trials (RCTs) within itself, that is, registry RCTs (R-RCTs). Several R-RCTs have been done in SOReg, for example, on internal herniation in gastric bypass [28,29] and the ongoing study on equipoise between gastric bypass and gastric sleeve (the Bypass Equipoise Sleeve Trial, BEST) [35].
Beyond these technical and more formal aspects of register quality, the most important success factor is user engagement. Users need to feel the register is meaningful and contributes to their clinical work. It is crucial that users consider the register handles data honestly, ethically, and in the interest of patients to ensure that the register can be of high quality. A successful register must also continuously provide updated reports, which are easy to access and understand, of analyzed data [3,32,36-38].
Table note: Results are presented for the 3 validation periods.

Limitations
Among the strengths of this study are the large number of unique entries (n = 100,200), all reabstracted by 1 examiner during a continuous validation process with regular on-site visits to all operating centers. The process allows access to all information in the electronic medical records, not only selected data on shared paper copies. However, despite the large number of studied patients (n = 1860), only about 4% of all individuals who have had bariatric surgery in Sweden during the study period were evaluated. In a sensitivity analysis, the selected patients did not differ from the other patients in SOReg concerning age, gender, BMI, and type of bariatric procedure. Moreover, the validation was done on different types of variables (e.g., administrative and medical data), at various time points (baseline to 2-year follow-up) and contained mandatory as well as nonmandatory data.
In conclusion, the high acquisition rate (97.4%) implies that SOReg reflects Swedish bariatric surgery on a nationwide basis, while the high internal validity (99.0%) makes the use of SOReg data reliable for research.The overall completeness could be improved by increased registration of follow-up data already captured in medical records and other digital systems.

Table 1
Acquisition rate in the Scandinavian Obesity Surgery Registry (SOReg) in 2012-2018

Table 3
Detailed analysis of 61 mandatory and 19 nonmandatory variables concerning completeness of entered data at various time points