Anchoring Vignettes in EQ-5D-5L Questionnaire: Validation of a New Instrument

Background: Health-related quality of life (HRQoL) is an indicator of a patient's physical, psychological and social life. HRQoL is influenced by experience, beliefs, perceptions and expectations, and reflects the patient's own subjective perspective. The EQ-5D-5L and SF-12 questionnaires are validated instruments for measuring HRQoL and are increasingly administered in electronic formats. Objective: The main purpose is to evaluate the feasibility of anchoring vignettes for the EQ-5D-5L questionnaire, with the aim of improving the intergroup comparability of responses among different subjects. A comparison with the SF-12 questionnaire is also carried out. Method: This is a cross-sectional study conducted at the cardiology outpatient clinics of the University Hospital of Padova, Italy. Thirty-eight subjects with a diagnosis of cardiovascular disease or at risk of cardiovascular disease were enrolled. A factor analysis was performed to assess the convergent validity of the EQ-5D-5L questionnaire against the SF-12. Moreover, a Compound Hierarchical Ordered Probit (Chopit) model was estimated to evaluate whether the questionnaire form affects the subjective evaluation process, in order to compare the EQ-5D-5L with and without vignettes. Results: Correlation and factor analysis show that the EQ-5D questionnaire is coherent with the SF-12 in paper format. Chopit model estimation shows that the questionnaire format does not affect the subjective interpretation of the questions. Moreover, in a parametric model including vignettes, educational attainment, disease severity, and gender are predictors of HRQoL status. Conclusion: The EQ-5D including vignettes in electronic format seems to be a valid tool to measure HRQoL when compared with the EQ-5D without vignettes in paper format and with the SF-12 questionnaire.


INTRODUCTION
Health-related quality of life (HRQoL) measures represent an important part of assessing the quality of routine care in general practice. They are useful in understanding the patient's point of view about the disease and the treatment methods applied, and deserve careful consideration when comparing different treatment methods and evaluating interventions [1].
Nurses are educated to provide a wide range of components of care and are aware of the importance of the quality of patients' lives, because nursing is holistically concerned with the whole patient and is a caring practice aimed at health promotion and at the maintenance or restoration of function [2].
There is growing evidence that 'quality of life assessment' is particularly relevant to the scope of nursing practice [3] and can be considered an adjunct to clinical and physiological assessments in many chronic conditions, particularly cancers [4] and cardiovascular diseases [5]. This approach is the 'gold standard' in the evaluation of nursing care, healthcare services and outcome assessment.
Health-related quality of life tools have the potential to identify specific and general health needs, which is particularly important in nursing, where care is holistic. Components of HRQoL tools are likely to be associated with specific health care needs, and measuring HRQoL may lead to improved quality of care and to an improvement in patients' QoL. The administration of quality of life tools can also provide a rapid screening to identify patients' health needs.
Nevertheless, one of the main barriers to using HRQoL as an outcome indicator when comparing the results of different surveys is the so-called interpersonal incomparability [6]. The evaluation and decision-making process that leads a respondent to choose among the response categories of a survey question is quite complex. This complexity results not only from individuality but also from socio-cultural factors: the same question can be interpreted in different ways by people belonging to different cultural contexts, and such differences can be detected even among individuals who share the same cultural background.
These factors may hamper the comparability of survey research especially if they are related to heterogeneous study populations and if the phenomenon under investigation is somewhat abstract [7].
For these reasons, King and colleagues (2004) introduced anchoring vignettes, a methodological tool that seeks to correct for the different interpretations that respondents can give to the response categories of ordinal scales.
The main research purpose of this article is the validation of a touch-screen EQ-5D-5L questionnaire with anchoring vignettes, through comparison with the SF-12 questionnaire in paper format without vignettes. Furthermore, we investigated whether the questionnaire form (paper or touch screen) and/or the presence of the vignettes in the EQ-5D-5L affects how respondents interpret the questions posed to them.

Study Design and Setting
This is an observational cross-sectional study conducted at the cardiology outpatient clinics of the University Hospital of Padova between February and March 2015. Patients who underwent a medical examination during the recruitment period were enrolled in the study. The subjects involved were both patients with cardiovascular disease and patients at risk of cardiovascular disease. The eligibility criteria were the age of consent (over 18 years), the absence of any major cognitive impairment, and Italian as a mother tongue. Verbal informed consent was obtained from all the subjects involved in the study after an explanation of the aim of the research.

Vignettes Techniques
Vignettes are questions about fictitious persons whose situation relates to the phenomenon under investigation (an example is provided in Box 1). The vignettes are presented to the respondent so as to depict greater or lesser adherence to the concept being measured [6].

Box 1.
• Andrea/Giulia walks for one to two kilometers every day without tiring, but he/she cannot run anymore due to an injured knee.

By comparing the responses given to the self-assessment questions with those given to the vignette questions, it is possible to overcome the incomparability between responses given by different subjects to the same question, which also affects HRQoL evaluations.
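The comparison between self-assessments and vignette ratings can be illustrated with the simple nonparametric recoding described by King and colleagues (2004), which is distinct from the parametric Chopit model used in this study. The sketch below assumes that each respondent rates the J vignettes in strictly increasing order; ties and order violations, which the full method handles differently, are simply rejected here. The function name `anchor_response` and the example ratings are hypothetical.

```python
def anchor_response(self_rating, vignette_ratings):
    """Recode a self-rating relative to the respondent's own vignette ratings.

    Returns a value on the 1..2J+1 scale: 1 means below the first vignette,
    2 means equal to it, 3 means between the first and second, and so on.
    Assumes vignette_ratings is strictly increasing (no ties, no reversals).
    """
    if any(a >= b for a, b in zip(vignette_ratings, vignette_ratings[1:])):
        raise ValueError("vignette ratings must be strictly increasing")
    position = 1
    for z in vignette_ratings:
        if self_rating < z:
            return position
        if self_rating == z:
            return position + 1
        position += 2
    return position


# Two hypothetical respondents give the same raw answer (3 on a 1-5 scale)
# but rate the same two vignettes differently.
first = anchor_response(3, [3, 5])
second = anchor_response(3, [1, 2])
print(first, second)
```

Although both respondents chose the same raw category, the recoded values differ because each answer is located relative to that respondent's own use of the scale.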
From a statistical standpoint, Greene and Hensher (2010) and Wand (2011) introduced and reviewed, respectively, the compound hierarchical ordered probit (Chopit) model, which can be used to analyze ordinal responses to a questionnaire that includes vignettes [8].
In the literature, several instruments have been proposed to measure HRQoL. Among them, the EQ-5D-5L questionnaire is a standardized instrument applicable to a wide range of health conditions and treatments. The EQ-5D-5L provides a simple descriptive profile and a single index value for health status, based on five dimensions [9]: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Electronic versions of these questionnaires have often been used in recent years because they improve the effectiveness and efficiency of the survey by reducing the probability of error in the data entry process [10].
Other authors have used anchoring vignettes in the paper-format EQ-5D-5L questionnaire to correct for interpersonal incomparability in HRQoL evaluation [11].

Instruments of Data Collection
Patients completed two different types of quality of life questionnaire: the EQ-5D-5L questionnaire in traditional and vignettes form, administered on paper or on a touch-screen device, and the SF-12 questionnaire on paper. Additional information was collected on age, gender, job, educational level, presence of or predisposition to cardiovascular disease, pharmacological therapy, and Implantable Cardioverter Defibrillator (ICD) treatment (Fig. 1).

EQ-5D-5L Questionnaire
Developed by the EuroQol Group, the EQ-5D-5L is the 5-level version of the EQ-5D. The questionnaire comprises the same 5 dimensions as the previous EQ-5D version (mobility, self-care, usual activities, pain/discomfort, anxiety/depression), with 5 levels for each dimension: no problems, slight problems, moderate problems, severe problems and extreme problems [12]. The EQ-5D was developed as a simple generic measure of quality of life [13] and has been validated in several studies on cardiovascular diseases [14-16]. In our study, the same questionnaire was administered in its standard format and in a new version with anchoring vignettes. The vignettes were created by translating the questionnaire proposed by Au and Lorgelly [11].

Statistical Analysis
Correlation measures are computed (on the conventional and polychoric correlation matrices) on the standardized sum of scores for each dimension of the EQ-5D and SF-12 questionnaires in order to assess their concordance. A factor analysis on the polychoric correlation matrix has been performed to assess the coherence [16] of the EQ-5D in vignettes format with the SF-12 without vignettes. Results are typically interpreted in terms of the major loadings on each factor and are represented either as a table of loadings or graphically, showing all the loadings with an absolute value greater than 1.
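As a rough illustration of the conventional correlation step, the sketch below computes a Pearson correlation between two dimension scores together with an approximate 95% confidence interval based on the Fisher z-transformation. The paper does not state how its intervals were obtained, so the Fisher method is an assumption, the polychoric correlation is not covered, and the function name `pearson_with_ci` and the scores are hypothetical.

```python
import math
from statistics import NormalDist, mean


def pearson_with_ci(x, y, level=0.95):
    """Pearson correlation with an approximate Fisher-z confidence interval.

    Requires two samples of equal length (n >= 4) that are neither constant
    nor perfectly correlated.
    """
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two samples of equal length with n >= 4")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return r, math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical scores for ten respondents on two comparable dimensions.
eq5d_pain = [1, 2, 2, 3, 1, 4, 2, 5, 3, 1]
sf12_bodily_pain = [1, 2, 3, 3, 2, 4, 1, 5, 4, 2]
r, lower, upper = pearson_with_ci(eq5d_pain, sf12_bodily_pain)
print(f"r = {r:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With a sample as small as the one in this study, such intervals are wide, which is worth keeping in mind when reading the correlation table.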
A Chopit model has been estimated for the components that may affect the item evaluation process, with the questionnaire form (paper or touch screen) included as an explanatory covariate [18].
The standard parametric analysis presented in this paper is based on the Chopit model [18]. It consists of two sets of response variables, one for the self-assessment and one for the answers that the interviewee gives to the vignettes. The main difference between the Chopit and the probit model lies in the thresholds of the continuous latent variable that defines the response process. These thresholds are fixed in probit models, whereas in the Chopit model they vary with individual characteristics, thus allowing covariates to affect the subjective evaluation [19].
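The difference between fixed and respondent-specific thresholds can be sketched numerically. The example below assumes a latent variable with unit variance and, for simplicity, a single covariate that shifts every cut point by the same amount; the function names, cut points and coefficient values are hypothetical and are not estimates from this study.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def category_probabilities(mu, thresholds):
    """P(Y = k), k = 1..K, for latent mean mu and K-1 increasing cut points."""
    cdf_at_cuts = [0.0] + [PHI(t - mu) for t in thresholds] + [1.0]
    return [hi - lo for lo, hi in zip(cdf_at_cuts, cdf_at_cuts[1:])]


def individual_thresholds(base_cuts, gamma, v):
    """Cut points shifted linearly by a respondent covariate v."""
    return [t + gamma * v for t in base_cuts]


base_cuts = [-1.0, 0.0, 1.0, 2.0]  # four cut points give five response levels
mu = 0.5                           # same latent health level in both cases

fixed = category_probabilities(mu, base_cuts)
shifted = category_probabilities(mu, individual_thresholds(base_cuts, 0.4, 1.0))

for k, (p_fixed, p_shifted) in enumerate(zip(fixed, shifted), start=1):
    print(f"level {k}: fixed={p_fixed:.3f} shifted={p_shifted:.3f}")
```

The two respondents have the same latent level, yet their expected answers differ because the second one places the cut points higher on the latent scale.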
Using only the self-evaluation component, it is not possible to separate the parameters (β) related to the self-evaluation from those (γ) defined on the cut points of the latent variable (see Appendix). For this reason, it is important to use the information provided by the vignettes to model this component as well.
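A small numerical check may help to see why the self-assessments alone cannot separate β from γ. The sketch assumes one covariate x that enters both the latent mean and the thresholds, a common shift for all cut points, and unit variance; under these assumptions the self-assessment probabilities depend only on the difference γ - β, whereas the vignette probabilities depend on γ alone. All names and values are hypothetical.

```python
from statistics import NormalDist

PHI = NormalDist().cdf
BASE_CUTS = [-1.0, 0.0, 1.0, 2.0]


def _probs(center, cuts, sd=1.0):
    """Category probabilities for a normal latent variable cut at `cuts`."""
    cdf = [0.0] + [PHI((c - center) / sd) for c in cuts] + [1.0]
    return [b - a for a, b in zip(cdf, cdf[1:])]


def self_assessment_probs(x, beta, gamma):
    # Latent mean beta * x, thresholds shifted by gamma * x.
    return _probs(beta * x, [t + gamma * x for t in BASE_CUTS])


def vignette_probs(x, theta, gamma):
    # The vignette level theta does not depend on the respondent;
    # only the thresholds do.
    return _probs(theta, [t + gamma * x for t in BASE_CUTS])


x = 1.0
# Two parameter pairs with the same difference gamma - beta.
self_a = self_assessment_probs(x, beta=0.5, gamma=0.2)
self_b = self_assessment_probs(x, beta=0.8, gamma=0.5)
vign_a = vignette_probs(x, theta=0.0, gamma=0.2)
vign_b = vignette_probs(x, theta=0.0, gamma=0.5)

print(max(abs(a - b) for a, b in zip(self_a, self_b)))  # numerically zero
print(max(abs(a - b) for a, b in zip(vign_a, vign_b)))  # clearly non-zero
```

The self-assessment data cannot tell the two parameter pairs apart, while the vignette responses can, which is the role the vignettes play in the model.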
The results of the Chopit model, estimated on the vignettes questionnaire, have been compared with those of a conventional ordinal probit model estimated on the data without vignettes.

RESULTS
Table 1 reports the correlation coefficients between the dimensions included in each questionnaire. These measures of convergent validity between the EQ-5D (with vignettes) and the SF-12 questionnaire indicate that anxiety is related to all SF-12 dimensions except general mental health (MH), and that physical function (PF) is associated with all EQ-5D dimensions.
Table 1. Correlation coefficient and 95% CI between EQ-5D and SF-12 dimensions.
Five latent factors with eigenvalues greater than 1 are identified (Fig. 2). The loadings of the variables on the latent factors (Fig. 3) show that the variables related to physical activity in the EQ-5D and SF-12 questionnaires are coherently relevant for the first factor. The variables with the greatest loadings on the second factor are related to emotional health and self-care status in both questionnaires. The mobility and self-care variables are relevant for the third factor, the activity and pain components are relevant for the fourth factor, and the anxiety component contributes most to the last factor. The identified dimensions are coherent with the different aspects of HRQoL that the questionnaire aims to capture.
A comparison between the EQ-5D questionnaire in paper and touch-screen format has been performed by evaluating whether the questionnaire form affects the subjective evaluation parameters of a Chopit model. The Chopit model has been estimated on each questionnaire dimension.
The questionnaire form is not a significant factor affecting the subjective evaluation, as modeled by the γ parameters (Tables 2-5). For each estimated model, the questionnaire form (paper or touch screen) does not affect the subjective interpretation of the ordinal scale proposed. This result follows from the lack of statistical significance of the estimates on the thresholds of the latent variable; the subjective choice is modeled through a normally distributed latent variable, on whose thresholds are defined the covariates that may affect the subjective evaluation process.
Moreover, the significant estimates (β) provided by the Chopit model, which includes vignettes, are also significant and coherent, in terms of estimated effect, in the conventional proportional odds model (Tables 2-5) estimated on the EQ-5D questionnaire without vignettes. Considering both the Chopit model and the ordinal probit model, educational attainment appears to be the main factor influencing the dimensions of subjective quality of life. As shown in Tables 2 and 3, it is statistically significant for the mobility and activity dimensions (P-value<0.05), indicating a lower propensity among people with a higher education level to perceive themselves as having problems in mobility (OR=0.33) and activity (OR=0.44). Gender is related to the perceived quality of life (P-value<0.05), in particular for the anxiety dimension (Table 5), denoting a greater tendency for females to perceive a poorer quality of life (OR=2.79). As shown in Table 4, gender and ICD implantation are associated with subjective pain perception; specifically, women perceive more pain and discomfort (OR=2.58). Moreover, ICD implantation is associated with a higher propensity to perceive pain (OR=5.74).
The model has not been estimated for the self-care dimension because the majority of individuals in the sample (33 subjects) reported no problems in washing or dressing themselves, indicating that almost all patients are autonomous in the management of their daily chores.

DISCUSSION
The aim of the study is to assess the coherence of the EQ-5D with the SF-12 paper questionnaire, a validated tool for measuring HRQoL outcomes. The assessment of convergent validity between the EQ-5D and the SF-12 indicates that comparable dimensions are more strongly related (i.e. Activity versus Physical Function, Pain versus Bodily Pain, and Anxiety versus Vitality and Social Functioning). A similar correlation pattern has been reported by Pattanaphesaj et al. [20].
Considering the overall scores on the EQ-5D with vignettes and on the SF-12, five latent dimensions can be identified (Health and Physical activity, Emotional health, Self-care, Pain and discomfort, and Anxiety), coherently with the structure of the EQ-5D questionnaire. Other studies have also confirmed the convergent validity of the two instruments in measuring HRQoL [21].
Once an overall coherence between the different instruments used to measure the same HRQoL outcome had been established, a further objective was to evaluate whether the touch-screen form of the vignettes questionnaire affects the item interpretation process.
The EQ-5D vignettes questionnaire in paper form has been considered in the literature as a validated instrument that accounts for heterogeneity in the item interpretation process [11]. It has been shown that the EQ-5D questionnaire may be subject to Differential Item Functioning (DIF) [3], that is, a different item evaluation process across heterogeneous subjects. If DIF is not considered, conclusions about perceived HRQoL may be misleading, especially when heterogeneous groups are analyzed [19].
The Chopit model, estimated on the vignettes questionnaire, allows covariates to be placed on the item evaluation process, thereby identifying variables that may affect the subjective interpretation of the questions [7]. For this reason, the questionnaire form has been included in the model as an explanatory covariate defined on the respondent's interpretation of the item.
Concerning the validation of the EQ-5D questionnaire with vignettes in touch-screen format, the questionnaire form does not affect the interpretation of the questions among subjects, suggesting that the electronic format is equivalent to the corresponding validated paper tool.
Moreover, as the administration of the electronic questionnaire is a widespread practice, many patient reported outcome studies confirm that paper and computer administered questionnaires are equivalent [22].
Considering also the patient's perspective, an important aspect of patient-centered outcomes research (PCOR), in some studies patients preferred electronic surveys, especially when being assessed for psychological aspects [23,24].
A further point under consideration is the coherence, in terms of estimated results, between a Chopit model estimated on data that include vignettes and a conventional ordinal probit model.
The results are very similar in terms of the effects of the factors affecting HRQoL in both estimated models. Consistent with the existing literature [25], subjects with a higher level of educational attainment achieve lower scores on the EQ-5D scale in the mobility, activity and anxiety dimensions, indicating a better quality of life in these domains.
Considering the pain dimension, the factors affecting HRQoL seem to be ICD implantation and gender. Other studies have confirmed that disease severity is associated with subjective pain perception: the presence of a previous chronic disease was associated with a higher perception of pain and discomfort [26,27]. As shown in other studies, there is a significant relation between gender and the anxiety/depression dimension of the EQ-5D questionnaire [21].
However, we acknowledge that further research is needed to generalize the validity of the EQ-5D vignettes questionnaire in electronic form, given that our conclusions are based on a small study sample. In fact, 90% of articles reporting the validation of patient-centered outcomes had a sample size greater than or equal to 100 [28].
Moreover, when an abstract concept such as HRQoL is considered, making the most of the vignettes is clearly important, but in our case the questionnaire was validated on a fairly homogeneous sample of patients. A vignettes questionnaire is especially useful for correcting DIF when the heterogeneity of individuals in terms of socio-cultural characteristics increases [29]. A further research development may be the validation of the EQ-5D questionnaire on a larger sample of patients who are more heterogeneous in terms of health, social background and culture.

CONCLUSION
The EQ-5D with vignettes in electronic format is a tool for measuring HRQoL that seems as valid as other validated questionnaires used to measure the same concept, such as the SF-12 questionnaire.
Moreover, the electronic form of the questionnaire does not seem to affect the subjective item evaluation process.

ETHICS APPROVAL AND CONSENT TO PARTICIPATE
Not applicable.

HUMAN AND ANIMAL RIGHTS
No animals/humans were used for the studies that are the basis of this research.

CONSENT FOR PUBLICATION
Not applicable.

APPENDIX
The Chopit model (Compound Hierarchical Ordered Probit) is an extension of the classical ordinal probit to the case where a dataset also includes vignettes. The methodology involves two sets of response variables, one for the self-assessment and one for the answers that the interviewee gives to the vignettes.
In this context, $Y_i$ represents the response of the i-th subject to the self-assessment question (for i = 1,…,n), and $Z_{lj}$ is the individual score (for l = 1,…,L) attributed to the j-th vignette (for j = 1,…,2J+1); both the vignette scores and the self-assessments can take K non-negative integer values.
The model includes a self-assessment component and a component related to the vignettes.

Self-Evaluation Component
Assume that the actual level underlying the response of the i-th respondent to the self-evaluation question is a latent one-dimensional random variable $\mu_i$, which is a linear function of a vector of explanatory covariates $X_i$ with parameters $\beta$, plus a random effect $\eta_i$:
$$\mu_i = X_i \beta + \eta_i$$
The parameters $\beta$, as in a classical ordinal probit model, define the effect of the covariates on the concept under investigation, measured on an ordinal scale.
The level perceived by the i-th subject, $Y_i^*$, is an unobservable random component that is normally distributed around $\mu_i$. The observation mechanism that maps the perceived level to the observed response partitions the latent variable $Y_i^*$ by means of thresholds, as in the general ordinal probit model. Writing $\tau_i^k$ for the k-th threshold of respondent i, with $\tau_i^0 = -\infty$ and $\tau_i^K = +\infty$:
$$Y_i = k \quad \text{if} \quad \tau_i^{k-1} \le Y_i^* < \tau_i^{k}, \qquad k = 1, \dots, K$$
In this case the thresholds are no longer constant: they vary between respondents as a linear function of a vector of explanatory covariates $V_i$, through parameters $\gamma$ that define the effect of the subject's characteristics on the interpretation of the question:
$$\tau_i^{k} = V_i \gamma^{k}$$
Using only the self-evaluation component, it is not possible to separate the parameters $\beta$ and $\gamma$; for that reason, the information provided by the vignettes must be used to model this component as well.

Vignettes Component
The model assumes that the respondent perceives the level depicted in the j-th vignette, presented by the researcher, through an observation mechanism defined on a latent random variable $Z_{ij}^*$ that is normally distributed around $\theta_j$. The parameter $\theta_j$, which represents the perceived level for the vignette, varies only with the vignette j and not with the individual i, because the model assumes that each vignette is interpreted in the same way by all respondents (vignette equivalence).
Another assumption is that the interpretation of the self-evaluation question may vary among individuals, but each individual uses the measurement scale in the same way both to answer the self-evaluation question and to evaluate the fictitious situations presented in the vignettes. This property is called response consistency.
The discretization mechanism of the continuous latent variable $Z_{ij}^*$ is the same, since the score of the j-th vignette can take K values, as for the self-evaluation question. Writing $\tau_i^k$ for the k-th threshold of respondent i, with $\tau_i^0 = -\infty$ and $\tau_i^K = +\infty$:
$$Z_{ij} = k \quad \text{if} \quad \tau_i^{k-1} \le Z_{ij}^* < \tau_i^{k}, \qquad k = 1, \dots, K$$
The thresholds are determined by the same linear combination used in the self-assessment component, that is, through the $\gamma$ coefficients and the explanatory covariates $V_i$:
$$\tau_i^{k} = V_i \gamma^{k}$$
The cut points of the latent random variables $Z_{ij}^*$ and $Y_i^*$ are therefore the same and, unlike in the ordinal probit model, they vary according to individual characteristics.