ISO 16140-2 Microbiology of food and animal feeding stuffs.pdf



EN ISO 16140-2:2011 (E)

Secretariat:

Nederlands Normalisatie-instituut (NEN) Vlinderweg 6

P.O. box 5059

2623 AX Delft, NL Netherlands

2600 GB Delft, NL

telephone: +31 15 2690 112
fax: +31 15 2 690 201
E-mail: [email protected]

Subject:

Action From

doc.nr. ISO/TC 34/SC 9/WG 3 N 099

Date 2011-03-14

total pages 64

item nr.

Supersedes document

Working Group

ISO/TC 34/SC 9/WG 3 Method Validation

Netherlands

Draft EN ISO / CD 16140-2: Microbiology of food and animal feeding stuffs - Method validation - Part 2: Protocol for the validation of alternative (proprietary) methods against a reference method, vs. 11-03-2011

For information ISO/TC 34/SC 9 WG 3

This document has been developed by ISO/TC 34/SC 9 WG 3 / Project group 2 – Proprietary methods. The experts involved were:
- Paul in 't Veld, Food & Consumer Product Safety Authority, NL
- Max Feinberg, INRA, France
- Phil Feldsine, BioControl, representing AOAC, USA
- Danièle Sohier, ADRIA, representing AFNOR Certification, France
- Roy Betts, Campden BRI, representing MicroVal, UK
- Gro Johannessen, Veterinary Institute, representing NMKL/NordVal, Sweden
- Nacour Haout, IZSUM, Italy

The previous draft EN ISO 16140-2, vs. 28-01-2011, was submitted to the WG 3 members for comments at the end of January 2011. Based on the comments received, the PG 2 project leader Paul in 't Veld adjusted that draft. This document takes into account the comments and advice received from the members of WG 3/PG 2 and WG 3. As decided at the WG 3 meeting, this new draft is submitted for CD vote to the SC 9 secretariat so that the outcome of the CD vote can be discussed at the SC 9 meeting in June 2011 in Bournemouth, UK.

1 Document stage: Document language: E L:\009\ISO\Werkgroep 3\DOC's\N nummers\N051 -\N099 - Draft CD EN ISO 16140-2 (revised).doc STD Version 2.0


EN ISO 16140-2:2011 (E) CEN TC 275 secretariat: DIN

Microbiology of food and animal feeding stuffs - Protocol for the validation of alternative methods - Part 2: Protocol for the validation of alternative (proprietary) methods against a reference method

Microbiologie des aliments — Protocole de validation des méthodes alternatives

ICS: Descriptors:


EN ISO 16140:2008 (E)

Contents

1 Scope
2 Normative references
3 Terms and definitions
4 General principles for the validation and the certification of alternative methods
5 Qualitative methods - Technical protocol for their validation
6 Quantitative methods - Technical protocol for their validation
Annex A (normative) Specific rules for the acceptance of results already obtained in a prior validation scheme
Annex B (informative) Classification of sample types for validation studies
Annex C (normative) Order of preference for use of naturally and artificially contaminated samples in validation studies
Annex D (informative) General Protocols for Contamination by Mixture and Artificially Contaminating Food Matrices
Annex E (informative) Example for presenting results of accuracy study of the method comparisons study for qualitative methods
Annex F (normative) Points to be considered when selecting strains for testing selectivity
Annex G (normative) Test applied to the examination of discordant results
Annex H (informative) Example calculation of RLOD in a method comparison study and an interlaboratory study
Annex I (informative) Calculations for the comparison of the Relative Limit Of Detection (RLOD) between laboratories as obtained in an interlaboratory study
Annex J (informative) Principle of the accuracy profile for validation of quantitative methods
Annex K (informative) Application of the accuracy profile in the method comparison study
Annex L (informative) Application of accuracy profile to the validation of an alternative method to data from a collaborative study
Bibliography


Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

In other circumstances, particularly when there is an urgent market requirement for such documents, a technical committee may decide to publish other types of normative document:

- an ISO Publicly Available Specification (ISO/PAS) represents an agreement between technical experts in an ISO working group and is accepted for publication if it is approved by more than 50 % of the members of the parent committee casting a vote;
- an ISO Technical Specification (ISO/TS) represents an agreement between the members of a technical committee and is accepted for publication if it is approved by 2/3 of the members of the committee casting a vote.

An ISO/PAS or ISO/TS is reviewed after three years in order to decide whether it will be confirmed for a further three years, revised to become an International Standard, or withdrawn. If the ISO/PAS or ISO/TS is confirmed, it is reviewed again after a further three years, at which time it shall either be transformed into an International Standard or be withdrawn.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

EN ISO 16140-2 was prepared by Technical Committee ISO/TC 34, Food products, Subcommittee SC 9, Microbiology, in collaboration with Technical Committee CEN/TC 275, Food analysis - Horizontal methods, Working group 6, Contaminants.

This second edition of EN ISO 16140 cancels and replaces ISO 16140:2002, which has been technically revised.

EN ISO 16140 consists of the following parts, under the general title Microbiology of food and animal feeding stuffs - Protocols for the validation of microbiological methods:

- Part 1: Definitions
- Part 2: Protocol for the validation of alternative methods against a reference method


Introduction

The need for the food industry to rapidly assess the microbiological quality of raw materials and finished products and the microbiological status of manufacturing procedures has led to the development and refinement of alternative microbiological methods of analysis that are quicker and/or easier to perform than the corresponding reference method; some can also be automated.

The suppliers/producers of the alternative methods, the food and drink industry, the public health services and other authorities need a reliable common protocol for the validation of such alternative methods. The data generated can also be the basis for the certification of a method by an independent organisation.

This part of EN ISO 16140 is intended to provide a specific protocol and guidelines for the validation of proprietary methods intended to be used as a more rapid and/or easier to perform method than the corresponding reference method. In addition, this part of EN ISO 16140 can also be used for the validation of other, non-proprietary, methods that are used instead of the reference method.

The use of this standard requires expertise in relevant areas such as microbiology, statistical design and analysis, as indicated in the respective sections. The statistical expertise encompasses: an overview of sampling theory and design of experiments, statistical analysis of microbiological data (from colony counts or presence/absence tests), and an overview of statistical concepts on random sampling, sample heterogeneity, sample stability, design of experiments and variance components.

When this part of ISO 16140 is next reviewed, account will be taken of all information then available regarding the extent to which the guidelines have been followed and the reasons for deviation from them in the case of particular products.

The harmonization of validation methods cannot be immediate, and for certain groups of products International Standards and/or national standards may already exist that do not comply with this horizontal method. It is hoped that when such standards are reviewed they will be changed to comply with this part of ISO 16140 so that eventually the only remaining departures from this horizontal method will be those necessary for well-established technical reasons. For example, the IDF standard 161A (Determination of bacteriological quality – Guidance on evaluation of routine methods) deals with a very specific validation for a very specific subject (the hygienic status of raw milk samples) and will remain as a vertical standard alongside ISO 16140-2. Where such a validation is needed, the IDF 161A [1] standard takes precedence.


1 Scope

This part of EN ISO 16140 specifies the general principle and the technical protocol for the validation of alternative, mostly proprietary, methods in the field of microbiological analysis of food, animal feeding stuff and environmental and primary production stage samples for:

- the validation of alternative (proprietary) methods;
- the international acceptance of the results obtained by the alternative (proprietary) method.

It also establishes the general principles of certification of these proprietary methods, based on the validation protocol defined in this EN ISO 16140-2 (see 4.2).

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

EN ISO 16140-1, Microbiology of food and animal feeding stuffs - Protocol for the validation of alternative methods - Part 1: Terminology.

3 Terms and definitions

The terms and definitions that are relevant for the purpose of this standard are presented in EN ISO 16140-1: Terminology.

4 General principles for the validation and the certification of alternative methods

Validation protocol

The validation protocol comprises two phases:

- a methods comparison study of the alternative (proprietary) method against the reference method (see definition in ISO 16140-1) carried out in the organizing laboratory;
- an interlaboratory study of the alternative (proprietary) method against the reference method carried out in different laboratories.

The technical rules for performing the methods comparison study and the interlaboratory study are given in clauses 5 and 6, depending upon whether the alternative (proprietary) method is qualitative or quantitative in nature.

Principles of the certification

If certification is required, then details of the organisation of certification should be provided by the certification body involved. The two following principles should be applied if certification is done:

- The manufacturer shall apply a quality management system covering the production line of the product for which the certification is sought and based on the appropriate International Standard relative to quality systems or other equivalent international standard (for example EN ISO 9001). In granting the certification, the certification organisation shall take into account the existence of any quality system certificate issued by a certification body accredited for quality systems.
- A regular verification of the quality of the certified method shall be undertaken after the certification is granted. An audit is to be performed regularly to verify that the following are still met:
  - the quality assurance requirements;
  - the product's production control requirements.

In addition to the general requirements of the appropriate International Standard related to the quality system, the manufacturer should submit updated documentation to the certification body if any changes or modifications are made to the product or production process which could affect the instructions for using the method and/or the method's performance. The certification organisation then decides whether these modifications affect the certification.

If the alternative (proprietary) method has already been validated according to the validation protocols of another organisation and meets the requirements set by that organisation, specific rules are defined in Annex A for accepting the results of this prior validation by another certification body.

5 Qualitative methods - Technical protocol for their validation

Methods comparison study

The methods comparison study is the part of the validation process that is performed in the organizing laboratory. It consists of three parts:

- a comparative study of the results of the reference method and the results of the alternative method in (naturally and/or artificially) contaminated samples (the so-called relative accuracy study);
- a comparative study to determine the relative limit of detection (RLOD) in artificially contaminated samples (the so-called RLOD study);
- an inclusivity/exclusivity study of the alternative method.

The reference and alternative methods shall be performed with, as far as possible, exactly the same sample (same test portion). A distinction is made between two study designs.

In the first, the same test portion can be used for both the reference and the alternative method because both methods have exactly the same first step in the enrichment procedure. As a consequence the results from both methods are related to each other. For example, when the sample is not contaminated, both methods should find the result of that sample negative. Due to this relationship the data produced by the reference and the alternative method are paired or matched. In this standard the wording "paired study" will be used for this type of study.

In the opposite situation there is no common first enrichment step for the reference and the alternative method. In this case different test portions have to be used for each method and the resulting data are unpaired or unmatched. In this standard the wording "unpaired study" will be used for this type of study.

The choice between a paired study and an unpaired study depends on the protocols of the reference and alternative method. Where there is a common first step in the enrichment procedures, a paired study design is mandatory.

A positive result obtained with the alternative method shall, in general, be confirmed in order to determine whether the result is a true positive or false positive result. This confirmation is only needed for the accuracy study part. All positive results obtained with the alternative method in an unpaired study shall be confirmed. In a paired study only the positive results obtained with the alternative method for which the corresponding result with the reference method was negative shall be confirmed. The confirmation procedure used is in most cases based on the reference method procedure; however, this is not mandatory.

Method comparison study for paired data

This section describes the method comparison study in case the reference and alternative method have a joint first step in the enrichment procedures (paired study).

Determination of the Relative Accuracy

The relative accuracy study is a comparative study between the results obtained by the reference method and the results of the alternative method. This study is conducted using naturally and/or artificially contaminated samples. Different food categories and types will be tested for this.

Selection of food and other categories to be used

The selection of (food) categories and types used within the validation will depend on the type or group of microorganism and the scope of the validation.

If the method is to be applied for "all foods" then five categories of food should be studied. The validation study report will state the food categories used in the study. If the method is to be validated for a restricted number of food categories, e.g. 'meat products' and 'milk and dairy products', then only these categories require study.

In addition to food categories, feed samples, environmental samples and primary production stage samples could be included as additional categories. This will broaden the application of the alternative method to these additional categories.

For all selected categories (food and others) three different (food) types per category shall be included in the study. Annex B presents an overview of the (food) types and (food) categories per type of micro-organism that might be relevant for the validation.

When selecting samples for the study, the highest priority is to find those that are naturally contaminated. If it is not possible to acquire a sufficient number of naturally contaminated samples, artificial contamination of samples is permissible (see Annexes C and D). It is desirable that food samples come from as wide a distribution as possible in order to reduce any bias from local food specialities and broaden the range of validation.

It shall be ensured that the selected (food) types include both high and low (natural) background microflora, different types of stress due to processing, and raw (unprocessed) (food) items. An example for the validation of a method for detection of Listeria monocytogenes could be:

- For the food category milk (products): the food types (1) raw unprocessed milk (high background flora, unstressed), (2) dried milk products (low background flora, desiccation stress) and (3) pasteurised milk products (low background flora, heat stress).
- For the food category meat (products): the food types (1) cooked meat products (lower background flora, heat stress), (2) fermented meat products (high background flora, pH stress) and (3) raw cured meat products (intermediate background flora, aw stress).

Note: Annex B should be used to facilitate the selection of food types and items for the type of micro-organism involved. It should not be regarded as a mandatory choice.

Number of samples, test sample preparation

For each category being examined, 3 types within that category should be used. For each type, 20 samples representative of this (food) type should be tested; this results in 60 samples per category. Fractional positive results by either the reference or alternative method (i.e. the samples should not be all positive or all negative) should be obtained for each type tested. In the ideal situation 10 samples (50 %) tested per type should be positive and 10 negative. Some naturally contaminated samples may contain high numbers of the target analyte. In such cases the naturally contaminated sample can be 'diluted' with uncontaminated material of the same (food) item to achieve a lower level of contamination.

Calculation and interpretation for Relative Accuracy with paired data

In general the data shall be presented in a report in order to have an overview of the raw data obtained, and information shall be given on the type of contamination (naturally contaminated or artificially contaminated) of the samples used. For artificially contaminated samples the (reference to the) procedure used for preparation shall be specified (see also Annex C).

Tabulate the data obtained from the results of the reference and alternative methods and calculate the following parameters for each (food) category (60 samples) and type (20 samples) according to Table 1.

Table 1 – Comparison of results between the reference and alternative method for paired data

Responses                          Reference method positive (R+)   Reference method negative (R-)
Alternative method positive (A+)   +/+ positive agreement (PA)      -/+ positive deviation (PD) (1) (confirmed (unconfirmed))
Alternative method negative (A-)   +/- negative deviation (ND)      -/- negative agreement (NA) (1) (confirmed (unconfirmed))

(1) = confirmed and unconfirmed results of the alternative method

Each table shall combine the unconfirmed and confirmed results from the alternative method in relation to the result of the reference method. Confirmation of the positive result obtained with the alternative method is only needed in case the result from the same sample obtained with the reference method is negative. An example of such a table is presented in Annex E.

Two cases can be distinguished when there is a difference between the unconfirmed and confirmed results of the alternative method. The first is that a result of a sample obtained with the alternative method is confirmed as being positive when the reference method gives a negative result for the same sample. The result after confirmation is then regarded as a true positive result, which indicates that the alternative method performs better than the reference method. The other case is when a positive unconfirmed result of the alternative method is negative after confirmation; this result then becomes a negative agreement instead of a positive deviation. This means that the alternative method gives false positive results.

Calculate, based on the results from Table 1 and using the unconfirmed results from the alternative method, the values for relative accuracy, specificity and sensitivity for each (food) category and per (food) type as follows:

Relative accuracy: $AC = \frac{PA + NA}{N} \times 100\,\%$;

Relative specificity: $SP = \frac{NA}{N_{-}} \times 100\,\%$;

Relative sensitivity: $SE = \frac{PA}{N_{+}} \times 100\,\%$

Where:
N is the total number of samples (NA + PA + PD + ND);
N- is the total number of negative results with the reference method (NA + PD);
N+ is the total number of positive results with the reference method (PA + ND).

Calculate, based on the results from Table 1 and using the confirmed results from the alternative method, the values for relative accuracy, specificity and sensitivity for each (food) category and per (food) type as above. In case there is a difference in PD and NA between unconfirmed and confirmed counts, the sensitivity of the alternative and reference method is additionally calculated separately as stated below.

For the alternative method: $SE_{alt} = \frac{PA + PD}{PA + ND + PD} \times 100\,\%$

For the reference method: $SE_{ref} = \frac{PA + ND}{PA + ND + PD} \times 100\,\%$
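The calculations above can be sketched in a few lines of Python. This is only an illustration of the formulas as stated in the text: the names PairedCounts and relative_measures are hypothetical, and the counts in the usage note are invented for the example, not taken from any validation study.

```python
from dataclasses import dataclass


@dataclass
class PairedCounts:
    """Counts from Table 1 for one (food) category or type (hypothetical container)."""
    pa: int  # positive agreement (R+ / A+)
    na: int  # negative agreement (R- / A-)
    pd: int  # positive deviation (R- / A+)
    nd: int  # negative deviation (R+ / A-)


def pct(numerator: int, denominator: int):
    """Percentage, or None when the denominator is zero (measure undefined)."""
    return 100.0 * numerator / denominator if denominator else None


def relative_measures(c: PairedCounts) -> dict:
    """Return AC, SP, SE, SE_alt and SE_ref in percent, following the text."""
    n = c.pa + c.na + c.pd + c.nd   # N: all samples
    n_neg = c.na + c.pd             # N-: negative with the reference method
    n_pos = c.pa + c.nd             # N+: positive with the reference method
    any_pos = c.pa + c.nd + c.pd    # positive with at least one method
    return {
        "AC": pct(c.pa + c.na, n),
        "SP": pct(c.na, n_neg),
        "SE": pct(c.pa, n_pos),
        "SE_alt": pct(c.pa + c.pd, any_pos),
        "SE_ref": pct(c.pa + c.nd, any_pos),
    }
```

For an invented category of 60 samples with PA = 25, NA = 28, PD = 4 and ND = 3, relative_measures(PairedCounts(25, 28, 4, 3)) gives SP = 87.5 and SE_alt = 90.625. Running the function once on the unconfirmed counts and once on the confirmed counts yields the two sets of values the text asks for.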

The results obtained can be summarised as presented in Table 2.

Table 2 - Overview of the Relative Accuracy study data per category for a paired study

Columns:
- Category
- PA
- NA (C (1), U (2))
- ND
- PD (C (1), U (2))
- Sum (N)
- Relative accuracy AC (%) (C (1), U (2))
- N+
- Relative sensitivity SE (%)
- Relative sensitivity ref. method SEref (%) (3)
- Relative sensitivity alt. method SEalt (%) (3)
- N-
- Relative specificity SP (%) (C (1), U (2))

Rows (one block per category: Category 1, Category 2, ... Category x):
- All types
- type 1
- type 2
- type 3

(1) C = confirmed results of the alternative method
(2) U = unconfirmed results of the alternative method
(3) Only needed in case the numbers of confirmed and unconfirmed results for the alternative method are not the same.
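The test on discordant results in this clause is a McNemar test on the counts PD and ND. The annex that specifies it is not reproduced here, so the sketch below uses the exact two-sided binomial form of the McNemar test as an assumption; the annex may prescribe a different variant. The function name mcnemar_exact_p is hypothetical.

```python
from math import comb


def mcnemar_exact_p(pd_count: int, nd_count: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts PD and ND.

    Under the null hypothesis of no difference between the methods, each
    discordant sample is equally likely to be a PD or an ND, so the smaller
    count follows a Binomial(PD + ND, 0.5) distribution.
    """
    n = pd_count + nd_count
    if n == 0:
        return 1.0  # no discordant results, nothing to test
    k = min(pd_count, nd_count)
    # Probability of observing k or fewer of the rarer discordant type.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

With invented counts, mcnemar_exact_p(10, 10) returns 1.0 even though 20 samples are discordant, while mcnemar_exact_p(0, 5) returns 0.0625. This shows why high but nearly equal PD and ND values pass the test unnoticed and need separate attention.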

Examine the discordant results as described in Annex G, using the counts of PD and ND. The test on discordant results is used for both unconfirmed and confirmed results of the alternative method. When the values for PD and ND are high and almost equal, no statistical difference between the methods can be detected using the McNemar test. In this case, the organising laboratory shall pay further attention to explaining the reasons for the high values of PD and ND. Moreover, it shows that the relative accuracy of a method shall never be interpreted by taking into account only the McNemar test.

Determination of the Relative Limit of Detection (RLOD)

The Relative Limit of Detection study is a comparative study between the reference and alternative method intended to compare the limit of detection of both methods. The study is conducted using replicates of artificially contaminated samples at a minimum of 3 known levels of contamination. Different food categories and types will be tested for this, as for the accuracy study. Positive deviating results obtained with the alternative method should be confirmed.

Selection of (food) categories and number of samples and replicates tested

For the selection of (food) categories and (food) types see 5.1.1.1.1. The same categories will be used as selected for the accuracy study (5.1.1.1). For each category one (food) type is chosen; this (food) type shall, if possible, be different from the (food) type selected for that category in the accuracy study, in order to have a broader representation of the (food) category evaluated.

The samples should be artificially inoculated. Procedures for the preparation of artificially inoculated samples are presented in Annex D. Each (food) type will be inoculated with a different strain.

A minimum of three levels per (food) type will be prepared. The first level shall be the negative control. Ideally the second level shall be the theoretical detection level and the third level just above the theoretical detection level. At least one of these levels should have fractional recovery by both methods. At the negative control level at least 5 replicate samples should be tested by both methods, at the second level (theoretical detection level) at least 20, and at the third level at least 5. For every level except the negative control, an estimate of the level of contamination should be made, e.g. using MPN or other counting techniques.

Note: In order to have a better assurance that fractional recovery will be obtained, more levels of contamination can be produced and tested.
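The design requirements above (at least three levels, minimum replicate numbers of 5, 20 and 5, and fractional recovery by both methods at one level or more) can be checked mechanically. The sketch below is a hypothetical helper, not part of the protocol. It assumes each level is given as a tuple (replicates, positives by the reference method, positives by the alternative method), lowest level first, and it applies no replicate minimum to any additional levels because the text does not state one.

```python
# Minimum replicates for: negative control, theoretical detection level,
# level just above the theoretical detection level.
MIN_REPLICATES = (5, 20, 5)


def is_fractional(positives: int, replicates: int) -> bool:
    """True when some, but not all, replicates are positive."""
    return 0 < positives < replicates


def check_rlod_design(levels):
    """Return a list of problems found; an empty list means none were found.

    levels: list of (replicates, ref_positives, alt_positives), lowest first.
    """
    problems = []
    if len(levels) < 3:
        problems.append("fewer than three contamination levels")
    for i, (minimum, (reps, _, _)) in enumerate(zip(MIN_REPLICATES, levels)):
        if reps < minimum:
            problems.append(
                f"level {i + 1}: {reps} replicates, at least {minimum} expected"
            )
    if not any(is_fractional(r, n) and is_fractional(a, n) for n, r, a in levels):
        problems.append("no level with fractional recovery by both methods")
    return problems
```

For example, with invented results, check_rlod_design([(5, 0, 0), (20, 9, 11), (5, 5, 5)]) returns an empty list, because the second level shows fractional recovery by both methods and all replicate minimums are met.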


Calculation and interpretation of the relative LOD (RLOD) for studies with paired data

The Relative Limit of Detection (RLOD) is calculated as the LOD of the alternative (alt) method divided by the LOD of the reference (ref) method. For each food category LODs and RLODs can be estimated by fitting an appropriate complementary-log-log (CLL) model to the combined absence/presence data of both methods as a function of method and contamination level. This will result in sigmoid curves in a graph of fraction positive versus log dose. For each category a model has to be fitted to the combined data of both methods. Two situations should be distinguished:

1. If the sample contamination levels are quantified by independent techniques (e.g. MPN) then the CLL model will use these levels by having an offset variable equal to the logarithm of the product sample contamination level. The model will fit a parameter for the method difference.
2. If the sample contamination levels are not quantified then the CLL model will have an offset variable equal to the logarithm of the inoculum size. The model will fit parameters for sample contamination levels and a parameter for the method difference.

RLOD should be calculated for each category separately as $e^{D}$, where D is the estimated regression parameter for the difference between alternative and reference method.

A statistical test should be used to investigate whether the RLOD is significantly different (at the 95 % confidence level) between categories (test for interaction between method and category). The test can be performed by fitting a CLL model to the combined data of both methods and all categories. Compared with the models specified above, parameters for category differences and for the interaction between method and category also have to be estimated. If there is no significant interaction then a combined RLOD is calculated as $e^{D}$, where D is the estimated regression parameter for the difference between alternative and reference method obtained by fitting a CLL model to the combined data (all categories tested). This model should not contain the interaction terms, and should contain parameters for category differences only when the main effect of category has been found significant.

In all cases an approximate 90 % confidence interval [RLODlow, RLODupp] for the resulting RLOD should be calculated, for example as

$[RLOD_{low},\ RLOD_{upp}] = \left[ e^{D - t_{df}(0.95)\, se(D)},\ e^{D + t_{df}(0.95)\, se(D)} \right]$,

where se(D) is the approximate standard error of the estimate D, df is the residual degrees of freedom of the fitted model, and $t_{df}(0.95)$ is the critical value from a Student t distribution with df degrees of freedom at the one-sided 95 % confidence level.
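Once a CLL model has been fitted, turning the estimate D into an RLOD with its confidence interval is a short calculation. The sketch below assumes that D, se(D) and the critical value of the t distribution are already available from the fitting software or from tables; it does not fit the model itself. The function names are hypothetical, and the default acceptability limit of 2 is the value this clause sets for paired study data.

```python
from math import exp


def rlod_interval(d: float, se_d: float, t_crit: float):
    """Return (RLOD_low, RLOD, RLOD_upp) from the regression estimate D.

    d      : estimated parameter for the alternative-vs-reference difference
    se_d   : approximate standard error of d
    t_crit : one-sided 95 % critical value of Student's t for the residual
             degrees of freedom of the fitted model (supplied by the caller)
    """
    rlod = exp(d)
    low = exp(d - t_crit * se_d)
    upp = exp(d + t_crit * se_d)
    return low, rlod, upp


def rlod_acceptable(rlod_upp: float, acceptability_limit: float = 2.0) -> bool:
    """True when the upper confidence limit lies below the acceptability limit."""
    return rlod_upp < acceptability_limit
```

With invented inputs d = 0.1, se_d = 0.2 and t_crit = 1.7, the upper limit is exp(0.44), roughly 1.55, so rlod_acceptable returns True. The exponentiation makes the interval asymmetric around the RLOD, which is expected because the model works on the log scale.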

Annex H gives an example for the calculation of the RLOD. An Excel spreadsheet for calculating LOD values is available via the link: http://www.wiwiss.fu-berlin.de/institute/iso/mitarbeiter/wilrich/PODLOD_ver3.xls. More information on the calculation of LOD can be found in [2].

An acceptability limit (AL) for the RLOD specifies the maximum increase in LOD of the alternative versus the reference method that would not be considered as relevant in consideration of the fitness for purpose of the method. Consequently AL will be a value > 1. The upper 95 % confidence limit calculated for RLOD (RLODupp) should be below AL. The interpretation should be made for each (food) category separately, taking into account the (food) types tested within the category. The AL for paired study data is set at 2, meaning that the LOD of the alternative method should not be more than 2 times the LOD of the reference method. An LOD value for the alternative method smaller than the LOD value for the reference method is always accepted, as this means that the alternative method is likely to detect lower levels of contamination than the reference method.

Determination of inclusivity and exclusivity

Selection and number of test strains

A range of strains should be used. Criteria for selecting test strains are given in Annex F. The strains used should take into account the measurement principle of the alternative method (e.g. culture based, immunoassay based, molecular). Different measurement principles may require the use of different test panels of strains.


Each strain used should be characterised biochemically and/or serologically and/or genetically in sufficient detail for its identity to be known. Strains used should preferentially have been isolated from foods, feeds, the food processing environment, or primary production taking into account the scope of the validation. However, clinical, environmental and culture collection strains can also be used. The original source of all isolates should be known and they should be held in a local (e.g. expert laboratory), national or international culture collection to enable them to be used in future testing if required. For inclusivity testing at least 50 pure cultures of microorganisms should be tested. For testing the inclusivity for Salmonella methods at least 100 pure cultures of different serotypes of Salmonella should be tested. For exclusivity testing at least 30 pure cultures of microorganisms should be tested.

Inoculation of target strains (inclusivity)

Each test is performed once and only with the alternative method. Inoculation of a suitable growth medium is carried out with a dilution of a pure culture of each test strain. The inoculum level should be 10 times to 100 times greater than the minimum detection level of the alternative method being validated and the protocol of the alternative method shall be used, including all enrichments detailed in the instructions of the alternative method. If the alternative method includes more than one enrichment protocol (e.g. for different sample types) then each of these protocols should be used with the complete panel of test strains. When negative or doubtful results are obtained, the test should be repeated and the reference method included to check that the strain could be detected with the appropriate reference method. If results are negative, consideration could be given to repeat the test with the addition of a food matrix.

Inoculation of non-target strains (exclusivity)

Each test is performed once and only with the alternative method. Inoculation of a suitable growth medium is carried out with a dilution of a pure culture of each test strain. No food sample is added. The pure culture should be grown in a non-selective broth under optimal conditions of growth to provide high cell populations in stationary phase. If the method involves a growth in a selective medium before a detection step, then for the purposes of exclusivity testing the selective medium should be replaced by a non-selective medium. If the alternative method gives a positive or doubtful result, then the test should be repeated using the complete enrichment protocol recommended in the instructions of the alternative method, using selective enrichments if these are noted in the instructions. If the alternative method includes more than one type of enrichment (e.g. for different sample types) then each of these should be used with the complete panel of test strains. Additionally the reference method should be used to check that the strain could not be detected with the reference method.

Expression and interpretation of the results

Tabulate the results as in Table 3:

Table 3 - Presentation of the results for the selectivity

| Micro-organisms | Alternative method: expected result | Alternative method: actual result |
| --- | --- | --- |
| Target strains (Inclusivity) | | |
| 1 | | |
| 2 | | |
| etc. | | |
| Non target strains (Exclusivity) | | |
| 1 | | |
| 2 | | |
| etc. | | |

The interpretation shall be done by the laboratory in charge of the methods comparison study. The report should state any anomalies from the expected results.


Method comparison study for unpaired data

This section describes the method comparison study in case the reference and alternative method do not have a joint first step in the enrichment procedures (unpaired study).

Determination of the Relative Accuracy

The relative accuracy study is a comparative study between the results obtained by the reference method and the results of the alternative method. This study is conducted using naturally and/or artificially contaminated samples. Different food categories and types will be tested for this. For the selection of food and other categories to be used see 5.1.1.1.1. For the number of samples and test sample preparation to be used see 5.1.1.1.2.

Calculation and interpretation for Relative Accuracy with unpaired data

In general the data shall be presented in a report in order to have an overview of the raw data obtained and information shall be given on the type of contamination (naturally contaminated or artificially contaminated) of the samples used. For artificially contaminated samples the (reference to the) procedure used for preparation shall be specified. Tabulate the data obtained from the results of the reference and alternative methods and calculate the following parameters for each (food) category (60 samples) and type (20 samples) according to Table 4.

Table 4 – Comparison of results between the reference and alternative method for unpaired data

| Responses | Reference method positive (R+) | Reference method negative (R-) |
| --- | --- | --- |
| Alternative method positive (A+) | +/+ positive agreement (PA) (1) (confirmed (unconfirmed)) | -/+ positive deviation (PD) (1) (confirmed (unconfirmed)) |
| Alternative method negative (A-) | +/- negative deviation (ND) (1) (confirmed (unconfirmed)) | -/- negative agreement (NA) (1) (confirmed (unconfirmed)) |

(1) = confirmed and unconfirmed results of the alternative method

Each table shall combine the unconfirmed and confirmed results from the alternative method in relation to the result of the reference method. Confirmation of a positive result obtained with the alternative method is needed in all cases because, unlike in paired studies, there is no relationship between the results obtained with the reference and the alternative method. When the result of the alternative method is confirmed, it can be regarded as a true positive result. When the result of the alternative method is not confirmed, it can be regarded as a false positive result for the alternative method.

Based on the results from Table 4, and using the unconfirmed results from the alternative method, calculate for each (food) category the values for relative accuracy, specificity and sensitivity as follows:

Relative accuracy: $AC = \frac{PA + NA}{N} \times 100\,\%$;

Relative specificity: $SP = \frac{NA}{N_{-}} \times 100\,\%$;

Relative sensitivity: $SE = \frac{PA}{N_{+}} \times 100\,\%$

Where:

N is the total number of samples (NA + PA + PD + ND);
N- is the total number of negative results with the reference method (NA + PD);
N+ is the total number of positive results with the reference method (PA + ND).

Based on the results from Table 4, and using the confirmed results from the alternative method, calculate for each (food) category the values for relative accuracy, specificity and sensitivity as above. The results obtained can be summarised as presented in Table 5.
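The three ratios defined above can be sketched in a few lines, assuming the four counts of Table 4 (PA, NA, PD, ND) are available for one (food) category. The function name and the example counts are hypothetical; they only illustrate the arithmetic.

```python
def relative_performance(pa, na, pd, nd):
    """Relative accuracy, specificity and sensitivity (in %) from Table 4 counts."""
    n = pa + na + pd + nd   # N: total number of samples
    n_neg = na + pd         # N-: negative results with the reference method
    n_pos = pa + nd         # N+: positive results with the reference method
    if n_neg == 0 or n_pos == 0:
        raise ValueError("both positive and negative reference results are needed")
    return {
        "AC": 100.0 * (pa + na) / n,
        "SP": 100.0 * na / n_neg,
        "SE": 100.0 * pa / n_pos,
    }


# Hypothetical counts for one category of 60 samples.
result = relative_performance(pa=25, na=30, pd=2, nd=3)
```

The same function would be called twice per category, once with the counts based on unconfirmed results and once with those based on confirmed results.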

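Table 5 – Summary of results obtained with the reference and alternative method for each category and type for unpaired data

The columns below are completed twice, once for the confirmed results and once for the unconfirmed results.

| Category | PA | NA | ND | PD | Sum (N) | Relative accuracy AC (%) | N+ | Relative sensitivity SE (%) | N- | Relative specificity SP (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Categ.: 1 All types | | | | | | | | | | |
| - type 1 | | | | | | | | | | |
| - type 2 | | | | | | | | | | |
| - type 3 | | | | | | | | | | |
| Categ.: 2 All types | | | | | | | | | | |
| - type 1 | | | | | | | | | | |
| - type 2 | | | | | | | | | | |
| - type 3 | | | | | | | | | | |
| Categ.: x All types | | | | | | | | | | |
| - type 1 | | | | | | | | | | |
| - type 2 | | | | | | | | | | |
| - type 3 | | | | | | | | | | |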

Examine the discordant results as described in annex F (the McNemar test), by using the counts of PD and ND. When the values for PD and ND are high and almost equal, no statistical difference between the methods can be detected using the McNemar test. In this case, the organising laboratory shall pay further attention to explaining the reasons for the high values of PD and ND. Moreover, it shows that the relative accuracy of a method shall never be interpreted by taking into account only the McNemar test.

Note: The McNemar test is in principle intended for use with paired data. However, it is also used for unpaired data because no suitable alternative is available at this moment. In this situation the effect of pairing is ignored.

Determination of the Relative Detection Limit (RLOD)

The Relative Limit of Detection study is a comparative study between the reference and alternative method intended to compare the limit of detection of both methods. The study is conducted using replicates of artificially contaminated samples at well known levels of contamination. Different food categories and types will be tested for this as for the accuracy study. Positive results obtained with the alternative method should be confirmed. For the selection of (food) categories and the number of samples and replicates tested see 5.1.1.2.1.

The calculation and interpretation of the Relative Limit of Detection (RLOD) for studies with unpaired data is the same as for studies with paired data; see 5.1.1.2.2. The AL for unpaired study data is set at 3, meaning that the LOD for the alternative method may not be higher than 3 times the LOD of the reference method. A LOD value for the alternative method smaller than the LOD value for the reference method is always accepted, as this means that the alternative method is likely to be more sensitive than the reference method.

Determination of inclusivity and exclusivity

For the determination of inclusivity and exclusivity see 5.1.1.3; these tests are identical for paired and unpaired studies.

Interlaboratory study

The aim of the collaborative study is to determine the variability of the results obtained by different collaborators using identical samples (reproducibility conditions). Wherever possible the study conditions should reflect the normal variation between laboratories. The distinction between paired and unpaired studies is indicated in the text; however, no separate sections are given as the effect on the measurement protocol and the analysis of data is limited. The interlaboratory study is organised by the expert laboratory.

Measurement protocol

The interlaboratory study shall produce 10 valid data sets from at least 10 collaborators. A collaborator is defined as an individual laboratory technician, who works completely independently from other collaborators, using different sets of samples. Normally these 10 collaborators originate from 10 different laboratories including data set(s) produced by the organising laboratory; however, different collaborators may come from the same laboratory if necessary. In these cases the collaborators shall come from a minimum of five different organisations (note: laboratories in different locations but belonging to one company or institute are accepted as different organisations). Technicians involved in the preparation of the samples used in the collaborative study shall not take part in the testing of those samples within the collaborative study. The protocol is the following:

A relevant food item (for selection see annex B) is used to prepare the test samples. The food should contain a natural background microflora. In cases where different enrichment protocols for the alternative method exist, a challenging enrichment protocol should be selected, e.g. the protocol having the shortest incubation time or the most selective enrichment conditions. The selected food item shall be relevant for the chosen enrichment protocol.



The food should be inoculated with the target organism. The protocol for inoculation of the food samples shall be appropriate for the selected food. Samples shall be prepared by the organising laboratory to ensure homogeneity between samples using matrix preparation protocols contained in Annex C and D.



At least three different levels of contamination per food shall be used: a negative control (L0) and two levels (L1 and L2), at least one of which shall produce fractional positive results.



At least eight blind replicates at each level of contamination are analysed by each collaborator by both methods. So in total a minimum of 48 results (8 replicates x 3 levels x 2 methods) per collaborator.



For tests which give paired results (paired result occurs when the primary enrichment is the same for the alternative and reference method) only one sample is required, this sample is analyzed by the alternative and the reference method. For tests which give unpaired results (unpaired result occurs when the alternative and reference methods start from different primary enrichments) different samples are required, one sample is analyzed by the protocol of the alternative method and another by the protocol of the reference method.



The data should be reported in two tables, the first giving the unconfirmed results from the alternative method and the confirmed results from the reference method, and the second giving the confirmed results from both methods. If the results for alternative and reference methods have been obtained from the same initial enrichment broth (paired data), then the confirmation of the reference method also confirms the alternative method. In cases when the reference method gives a negative result and the alternative method gives a positive result, then confirmation of the positive result is required. If the results for alternative and reference methods have been obtained from different enrichments (unpaired data), then all enrichments obtained with the alternative method should be taken forward for confirmation.



The organising laboratory can indicate that broths, plates and/or isolates shall be retained for a certain period of time to be able to confirm results obtained by a collaborator, if needed.


The analysis of samples shall be performed by each collaborator at the stipulated date;



In either case, the combination "number of levels of contamination/number of replicates/number of non-outlier collaborators" shall be selected so that at least 480 results (240 by each method) are generated for use in the calculations for each enrichment protocol.
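The result counts quoted in the protocol above (48 results per collaborator and at least 480 results in total, 240 by each method) follow from simple multiplication. The sketch below, with hypothetical function names, shows how a planned combination of collaborators, levels and replicates could be checked against that minimum.

```python
def results_per_collaborator(replicates, levels, methods=2):
    """Number of results one collaborator produces."""
    return replicates * levels * methods


def total_results(collaborators, replicates, levels, methods=2):
    """Number of results over all (non-outlier) collaborators."""
    return collaborators * results_per_collaborator(replicates, levels, methods)


# Minimum design from the protocol: 8 blind replicates, 3 levels, 2 methods,
# analysed by 10 collaborators.
per_collaborator = results_per_collaborator(replicates=8, levels=3)
total = total_results(collaborators=10, replicates=8, levels=3)
per_method = total // 2
design_is_sufficient = total >= 480

# If one collaborator is excluded, the same design falls below the minimum.
total_after_exclusion = total_results(collaborators=9, replicates=8, levels=3)
```

This makes visible that the minimum design leaves no margin: excluding a single collaborator requires more replicates, levels or collaborators to reach 480 results.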

The organising laboratory, using all recorded data, shall determine which results are suitable for use in analysing the data. The organising laboratory shall examine the raw data and other information requested in the data sheet to ascertain that all collaborators have performed the analyses according to both the alternative and reference methods as written. When there is evidence that results might have been obtained under inappropriate conditions and/or the methods have not been followed strictly, these results or all results from the collaborator are excluded from further analysis. When the interlaboratory study is completed, all the information on data sheets and the results shall be submitted to the organising laboratory and examined as follows:

Disregard data from collaborators if transit conditions and times fall outside the specified acceptable tolerances (the limits for transport time and temperature have to be set before the samples are shipped);



Disregard data from collaborators that received samples/test kits, etc. that were damaged during transportation;



Disregard data from collaborators using media formulations that are not in accordance with the (reference) method;



Disregard data from collaborators if the questionnaire suggests that the laboratory has deviated from either the standard protocol or the critical operating conditions.

Calculations

The results obtained in the collaborative study are summarised in Tables 6, 7 and 8. Tables 7 and 8 are used twice, once for the unconfirmed results and once for the confirmed results of the alternative method.

Table 6 – Positive results by the reference method

| Collaborators | Contamination level L0 | Contamination level L1 | Contamination level L2 |
| --- | --- | --- | --- |
| Collaborator 1 | /8 | /8 | /8 |
| Collaborator 2 | /8 | /8 | /8 |
| Collaborator 3 | /8 | /8 | /8 |
| etc. | /8 | /8 | /8 |
| Total | FP (a) | P1 (b) | P2 (c) |

a False positive by the reference method
b Positive at level 1 by the reference method
c Positive at level 2 by the reference method

Table 7 – Positive results (unconfirmed and confirmed) from the alternative method

| Collaborators | Contamination level L0 | Contamination level L1 | Contamination level L2 |
| --- | --- | --- | --- |
| Collaborator 1 | /8 | /8 | /8 |
| Collaborator 2 | /8 | /8 | /8 |
| Collaborator 3 | /8 | /8 | /8 |
| etc. | /8 | /8 | /8 |
| Total | FP (a) | P1 (b) | P2 (c) |

a False positive by the alternative method
b Positive at level 1 by the alternative method
c Positive at level 2 by the alternative method

Table 8 - Comparison between the reference method and the alternative method (unconfirmed and confirmed results)

| Alternative method | Reference method + | Reference method - |
| --- | --- | --- |
| + | PA | PD |
| - | ND | NA |
| Total | N+ | N- |
| Total | N | |

For level L0 and for the reference and alternative (both before and after confirmation) method, calculate the percentage specificity as follows:

Specificity: $SP = \left(1 - \frac{FP}{N_{-}}\right) \times 100\,\%$;

Where: N- is the total number of all L0 tests; FP is the number of false positive results.

For each of the levels L1 and L2 the sensitivity (SE) is calculated (both before and after confirmation of the alternative method results) as follows:

Sensitivity: $SE = \frac{P_{alternative}}{P_{reference}} \times 100\,\%$;

Where: N+ is the total number of all L1 or L2 tests; P is the number of positive results for levels L1 or L2.

Note: The sensitivity depends on the level of contamination of the samples. Comparison of sensitivity values between the reference and alternative method can only be made within an interlaboratory study and not between different interlaboratory studies. Based on the results presented in Table 8, the relative accuracy is calculated as follows:

Relative accuracy: $AC = \frac{PA + NA}{N} \times 100\,\%$;

Where:

N is the total number of tested samples (for the levels L0, L1 or L2 tests);
PA is the number of positive agreements;
NA is the number of negative agreements.

The relative accuracy is calculated before and after confirmation of the alternative method results. It should be indicated for each value of AC whether it was calculated using confirmed or unconfirmed results and whether the study was based on paired or unpaired data.

Interpretation of the data

Examine the discordant results as described in annex F (the McNemar test), by using the counts of PD and ND from Table 8 (see 5.2.2). When the values for PD and ND are high and almost equal, no statistical difference between the methods can be detected using the McNemar test. In this case, the organising laboratory shall pay further attention to explaining the reasons for the high values of PD and ND. Moreover, it shows that the relative accuracy of a method shall never be interpreted by taking into account only the McNemar test.

Note: In addition, the difference in the relative limit of detection (RLOD) between laboratories is evaluated. This is done according to the method described in Annex H. As there is limited experience with the interpretation of these results, the results are used only for informative purposes.
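The remark that high and almost equal values of PD and ND hide a problem from the McNemar test can be illustrated with a short sketch. It uses the textbook chi-square form of the McNemar statistic without continuity correction, which is an assumption made here for illustration; Annex F may prescribe a different variant (for example an exact test for small counts). The counts are hypothetical.

```python
# Critical value of the chi-square distribution with 1 degree of freedom at the 5 % level.
CHI2_CRIT_5PCT_1DF = 3.841


def mcnemar_chi2(pd, nd):
    """Textbook McNemar statistic (no continuity correction) from discordant counts."""
    discordant = pd + nd
    if discordant == 0:
        return 0.0
    return (pd - nd) ** 2 / discordant


# Many discordant results, but evenly split: the statistic is zero.
balanced = mcnemar_chi2(pd=20, nd=20)

# Fewer discordant results, but unevenly split: the statistic is large.
skewed = mcnemar_chi2(pd=15, nd=3)

balanced_significant = balanced > CHI2_CRIT_5PCT_1DF
skewed_significant = skewed > CHI2_CRIT_5PCT_1DF
```

With 40 discordant results split evenly the statistic is zero, although the two methods disagree on many samples, which is why the text asks for the discordant results to be explained rather than relying on the test alone.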

Quantitative methods - Technical protocol for their validation

Methods comparison study

The Methods Comparison Study is the part of the validation process that is performed in one laboratory and consists of an inclusivity/exclusivity study of the alternative method, together with a comparative study of the results of the reference method to the results of the alternative method in contaminated samples.

Selection of Food and other categories to be used

The selection of food categories and types used within the validation will depend on the type or group of microorganism and the scope of the validation.

If the method is to be applied for "all foods" then five categories of food should be studied. The validation study report will state the food categories used in the study. If the method is to be validated for a restricted number of food categories, e.g. 'meat products' and 'milk and dairy products' then only these categories require study. In addition to food categories, feed samples, environmental samples and primary production stage samples may be included as additional categories. This will broaden the application of the use of the alternative method for these additional categories. For all selected categories (food and others) three different (food) types per category shall be included in the study. Annex B presents an overview of the relevant (food) types and (food) categories per type of microorganism, that might be relevant for the validation. When selecting samples for the study it is of the highest priority to find those that are naturally contaminated. If it is not possible to acquire a sufficient number of naturally contaminated samples, artificial contamination of samples is permissible (see annex C and D). It is desirable that food samples come from as wide a range of contamination as possible in order to reduce any bias from local food specialities and broaden the range of validation. It shall be ensured that with the selection of the different (food) types both high and low (natural) background microflora, different types of stresses due to processing and raw (unprocessed) (food) items are being included in the study. An example could be for the validation of a method for enumeration of Listeria monocytogenes: 

For the food category milk(products), the food types (1) raw unprocessed milk (high background flora, unstressed), (2) dried milk products (low background flora, desiccation stress) and (3) pasteurised milk products (low background flora, heat stress).



For the food category meat(products), the food types (1) cooked meat products (lower background flora, heat stress), (2) fermented meat products (high background flora, pH and aw stress) and (3) raw cured meat products (lower background flora, aw stress)

Note: Annex B should be used to facilitate the selection of food types and items for the type of micro-organism involved. It should not be regarded as a mandatory choice.

Number of samples and test sample preparation

For each (food) category being examined there should be 3 (food) types, and each type should be represented by 6 samples. Of the 6 samples, there should be 2 at low level, 2 at intermediate level and 2 at high level of contamination. These levels should cover the whole range of contamination of the selected (food) type. Of each sample 4 replicates, from the same dilution series, should be used (see Figure 1). So the total number of analyses for each category will be 72 (3 types x 3 levels x 2 samples x 4 replicates).

Note: The 2 samples at each level may be different, belonging to the same (food) type but not necessarily to the same (food) item. For example one sample might be full fat milk powder, the other infant formula; both belong to the same food type (dried milk products) but not to the same food item (see Annex B).

Figure 1 presents the experimental design for one category and 3 levels of contamination.

Figure 1 - Diagram of the experimental design for a (food) category assuming 3 levels of contamination.


Some naturally contaminated samples may contain high numbers of target analyte, and this can result in difficulty in achieving the required range of contamination. In such cases the naturally contaminated sample can be 'diluted' with uncontaminated material of the same (food) item. The reference and alternative methods shall be performed with, as far as possible, exactly the same sample.

For some alternative methods it is of interest to determine the limit of quantification (LOQ). The (lower) LOQ is only relevant when the measurement principle of the alternative method is not based on the visual observation of the target micro-organism. Examples of methods for which the LOQ needs to be determined are the instrumental measurement of conductivity or fluorescence which is related to the growth of the micro-organism. In case the LOQ needs to be determined, an additional level of contamination is added to one of the three types per category used in the method comparison study. The additional level is targeted to verify the lower limit of quantification of the alternative method. The other 3 levels remain more or less the same, i.e. low, intermediate and high levels of contamination representative for the selected food type. So for the selected food type a total of 32 (4 levels x 2 samples x 4 replicates) analyses will be done instead of 24 (3 levels x 2 samples x 4 replicates).

Presentation of Results

Tabulate the results as in Table 9 based on log transformed counts. Plot a graph of the data for each food type and each food category (i.e. combine the data from the food types in each category and plot a graph) using the y-axis (vertical) for the alternative method and the x-axis (horizontal) for the reference method. The points at each contamination level should form a discrete cluster. In order to detect outliers and non-linearity, visually check the graph for the presence of any abnormal results, that is, those that are obviously outside each cluster. If any are present, temporarily discard that result and repeat the calculations below in order to estimate its effect by comparison with the calculations using all the data. If a dilution effect is involved, examine this issue cautiously.
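The numbers of analyses quoted for the method comparison design (72 per category, 24 per food type, and 32 for the food type that carries the additional LOQ level) can be reproduced with the small sketch below. The function name is hypothetical.

```python
def number_of_analyses(levels, samples_per_level, replicates, types=1):
    """Analyses needed for the given number of types, levels, samples and replicates."""
    return types * levels * samples_per_level * replicates


# One category: 3 types x 3 levels x 2 samples x 4 replicates.
per_category = number_of_analyses(levels=3, samples_per_level=2, replicates=4, types=3)

# One food type without and with the additional LOQ level.
per_type = number_of_analyses(levels=3, samples_per_level=2, replicates=4)
per_type_with_loq = number_of_analyses(levels=4, samples_per_level=2, replicates=4)

# A category in which one of the three types includes the LOQ level.
per_category_with_loq = 2 * per_type + per_type_with_loq
```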

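Table 9: Results of the accuracy study (in log cfu/g).

| (Food) Category | (Food) Type | (Food) Item | Reference method Rep. 1 (x1) | Rep. 2 (x2) | Rep. 3 (x3) | Rep. 4 (x4) | Alternative method Rep. 1 (z1) | Rep. 2 (z2) | Rep. 3 (z3) | Rep. 4 (z4) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Category 1 | Type 1 | Sample 1 | | | | | | | | |
| | | Sample 2 | | | | | | | | |
| | Type 2 | Sample 3 | | | | | | | | |
| | | Sample 4 | | | | | | | | |
| | Type 3 | Sample 5 | | | | | | | | |
| | | Sample 6 | | | | | | | | |
| | | * Sample 7 | | | | | | | | |
| | | * Sample 8 | | | | | | | | |
| ... | ... | Sample 1, Sample 2, ... Sample 6 | | | | | | | | |
| Category x | Type 1-3 | Samples 1–6 ... * Samples 1–8 | | | | | | | | |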

* The additional samples 7 and 8 are only needed for the determination of the LOQ of instrumental-like alternative methods.

Accuracy profile for method comparison study data

The principle of the accuracy profile is explained in more detail in Annex I.

Calculation procedure

The notation is:
i the number of (food) item samples (1 ≤ i ≤ 6);
j the number of replicates (1 ≤ j ≤ 4);
$x_{ij}$ results obtained by the reference method, and
$z_{ij}$ results measured by the alternative method, after log transformation.

For each (food) item sample (or just sample), measurements are made under repeatability conditions for both methods. Thus, the $z_{ij}$ values are assumed to be normally distributed and their β-expectation tolerance interval is computed according to Gutman [13]. All computations can be summarized as the following sequence of operations.

Step 1: The acceptability criterion is set at ± λ log units. The acceptability criterion ± λ is expressed as a difference because of the use of logarithms.

Step 2: For each sample i, calculate $X_i$ as the median of the log transformed counts obtained with the reference method, noted $x_{ij}$. These values are the reference values of the validation samples:

$X_i = \mathrm{median}_j(x_{ij})$

Step 3: Calculate, for each sample i (using $z_{ij}$), the average $\bar{z}_i$ using formula (Eq. 3).

$\bar{z}_i = \frac{\sum_{j=1}^{J} z_{ij}}{J}$ (Eq. 3)

Step 4: Calculate, for each sample i (using $z_{ij}$), the standard deviation $s_i$ using formula (Eq. 4).

$s_i = \sqrt{\frac{\sum_{j=1}^{J} (z_{ij} - \bar{z}_i)^2}{J - 1}}$ (Eq. 4)

Step 5: For each sample i, compute the absolute bias as $\bar{z}_i - X_i$; this is an estimate of the lack of trueness of the alternative method when compared to the reference method;

Step 6: For each sample i, compute the limits of the β-expectation tolerance interval (β-ETI) according to Gutman [13]; the β-ETI is the interval in which the expected proportion of future results is β. For each sample, the β-ETI can be expressed as:

$\bar{z}_i \pm t_{\frac{1+\beta}{2},\,J-1} \times s_i \sqrt{1 + \frac{1}{J}}$ (Eq. 5)

Where $t_{\frac{1+\beta}{2},\,J-1}$ is the percentile of a Student-t distribution for β the chosen probability, and J the number of replicates (4 as requested); it is the coverage factor of the β-ETI of the validation sample. It defines two limits noted

$U_i = \bar{z}_i + t_{\frac{1+\beta}{2},\,J-1} \times s_i \sqrt{1 + \frac{1}{J}}$; $L_i = \bar{z}_i - t_{\frac{1+\beta}{2},\,J-1} \times s_i \sqrt{1 + \frac{1}{J}}$.

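Step 7: For each food category/food type, tabulate the different values calculated for each food item as in Table 10.

Table 10: Presentation of the statistical results of the comparison study

| (Food) Type | (Food) Item | Reference value $X_i$ | Average $\bar{z}_i$ | Bias $\bar{z}_i - X_i$ | Upper TI limit $U_i - X_i$ | Lower TI limit $L_i - X_i$ | Upper accept +λ | Lower accept –λ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Type 1 | Item 1 | | | | | | | |
| | Item 2 | | | | | | | |
| | Item 3 | | | | | | | |
| | Item 4 | | | | | | | |
| | Item 5 | | | | | | | |
| | Item 6 | | | | | | | |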
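Steps 2 to 6 of the calculation procedure can be sketched for a single sample as shown below. The sketch assumes that the Student-t percentile is supplied by the caller, and the example uses β = 0.80, which is an assumption for illustration because the value of β is not stated in this section; for β = 0.80 and J = 4 the percentile t(0.90, 3) is about 1.638. The function name and the log counts are hypothetical.

```python
import math
import statistics


def accuracy_profile_row(x_ref, z_alt, t_quantile):
    """Values of one Table 10 row for a single sample.

    x_ref      : log-transformed counts of the reference method (replicates)
    z_alt      : log-transformed counts of the alternative method (replicates)
    t_quantile : Student-t percentile t((1 + beta) / 2, J - 1), supplied by the caller
    """
    j = len(z_alt)
    x_i = statistics.median(x_ref)      # Step 2: reference value
    z_bar = statistics.mean(z_alt)      # Step 3: average of the alternative method
    s_i = statistics.stdev(z_alt)       # Step 4: standard deviation (J - 1 denominator)
    bias = z_bar - x_i                  # Step 5: bias
    half_width = t_quantile * s_i * math.sqrt(1 + 1 / j)  # Step 6: beta-ETI half-width
    return {
        "X": x_i,
        "z_bar": z_bar,
        "s": s_i,
        "bias": bias,
        "U_minus_X": bias + half_width,
        "L_minus_X": bias - half_width,
    }


# Hypothetical log cfu/g counts for one sample with 4 replicates per method.
row = accuracy_profile_row(
    x_ref=[3.10, 3.20, 3.00, 3.10],
    z_alt=[3.20, 3.30, 3.10, 3.20],
    t_quantile=1.638,  # assumption: beta = 0.80, J = 4
)
```

The returned differences to the reference value are the quantities plotted in the accuracy profile.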

Make a graphical representation of the computed results as follows:
- the horizontal axis is for reference values $X_i$ in log units;
- the vertical axis is for the bias, the acceptability limits, and the tolerance interval limits $U_i - X_i$ and $L_i - X_i$, all expressed in log units as differences to the corresponding reference value of the sample.

Make a graph for each (food) category. (Food) types of the same category can be combined into one figure.

Interpretation

Results are then illustrated in an accuracy profile graph like the example given in Figure 2. This graph is used as a graphical decision-support tool. The upper and lower tolerance-interval limits are connected by straight lines to interpolate the behaviour of the limits between the different levels of the validation samples. The horizontal line represents the reference values obtained with the reference method. The differences between reference values and average levels of contamination determined by the alternative method, $\bar{z}_i$, are represented by black dots. Whenever no bias exists, these recovered values are located on the horizontal reference line. In addition, acceptability limits are represented by two dashed horizontal lines and β-ETI limits as broken full lines. For the moment the tentative acceptability limit is set at ± 0.5 log units.

The alternative method is regarded as being equivalent to the reference method when the values for the β-ETI fall within the acceptability limits for all levels of contamination. When the values for the β-ETI fall outside the acceptability limits for all levels of contamination, the alternative method is not regarded as being equivalent to the reference method. In cases where the β-ETI is partly outside the acceptability limits, the application of the alternative method can be restricted to a smaller contamination range or further tests should be done in order to explain the results.

In case of instrumental methods, where an additional level of contamination is included at the level of the expected LOQ, the results will be included in the graph and used to estimate the LOQ. The LOQ can be located and precisely determined by interpolation where the β-ETI limit lines and the acceptance limit lines intersect. The LOQ is compared to the theoretical LOQ of the alternative method.
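The three outcomes described in the interpretation above can be expressed as a small decision helper. This is a sketch of one reading of the text, in which a level counts as "within" only when both tolerance-interval limits lie inside the acceptability limits; the function name, the returned labels and the example values are hypothetical.

```python
def classify_profile(ti_limits, lam=0.5):
    """Classify an accuracy profile from per-level tolerance-interval limits.

    ti_limits : list of (L_i - X_i, U_i - X_i) pairs, one per contamination level
    lam       : acceptability limit in log units (tentatively 0.5 in the text)
    """
    if not ti_limits:
        raise ValueError("at least one contamination level is required")
    within = [(-lam <= lower) and (upper <= lam) for lower, upper in ti_limits]
    if all(within):
        return "equivalent"
    if not any(within):
        return "not equivalent"
    return "restrict range or investigate further"


# Hypothetical profiles with two contamination levels each.
all_inside = classify_profile([(-0.3, 0.2), (-0.1, 0.4)])
all_outside = classify_profile([(-0.7, 0.2), (-0.1, 0.6)])
partly_outside = classify_profile([(-0.3, 0.2), (-0.1, 0.6)])
```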

Prepare a report as illustrated in Annex J.

Figure 2: Example of accuracy profile for a food type in the method comparison study. (Figure not reproduced. Horizontal axis: Contamination level; vertical axis: Accuracy (difference of Log); plotted series: Bias, Lower TI, Upper TI, Lower Acceptability, Upper Acceptability; the LOQ is marked on the graph.)

Inclusivity and Exclusivity

Inclusivity and exclusivity testing is not required for general enumeration methods such as total plate count (TPC) and yeast & mould (Y&M) methods. It is required for enumeration methods designed to detect specific microorganisms (e.g. Listeria).

Selection and number of test strains

A range of strains should be used. Criteria for selecting test strains are given in annex E. The strains used should take into account the measurement principle of the alternative method (e.g. culture based, immunoassay based, molecular). Different measurement principles may require the use of different test panels of strains. Each strain used should be characterised biochemically and/or serologically and/or genetically in sufficient detail for its identity to be known. Strains used should preferentially have been isolated from foods, feeds, the food processing environment, or primary production, taking into account the scope of the validation. However, clinical, environmental and culture collection strains can also be used. The original source of all isolates should be known and they should be held in a local (e.g. expert laboratory), national or international culture collection to enable them to be used in future testing if required. For inclusivity testing at least 50 pure cultures of microorganisms should be tested. For exclusivity testing at least 30 pure cultures of microorganisms should be tested.

Target microorganisms (Inclusivity)

The inoculum level should be at least 100 times greater than the minimum detection level of the alternative method being validated and the protocol of the alternative method shall be used. When negative or doubtful results are obtained, the test should be repeated and the reference method included to check that the strain could be detected with the appropriate reference method. If results are negative, consideration could be given to repeat the test with the addition of a food matrix.

Non-target microorganisms (Exclusivity)

Each test is performed once and only with the alternative method. The inoculum level should be similar to the greatest level of contamination expected to occur in any of the food categories being used. No food sample is added. The pure culture should be grown in a non-selective broth under optimal conditions of growth for at least 24 h and diluted to an appropriate level before testing is begun. If the alternative method gives a positive or doubtful result, then the strain should be tested with the reference method, to check if the strain gives a valid result.

Expression and interpretation of results

Tabulate the results as in Table 11.

Table 11 - Presentation of the results for the inclusivity and the exclusivity.

Micro-organisms                       Alternative method
                                      Expected result     Actual result
Target strains (Inclusivity)
  1
  2
  etc.
Non target strains (Exclusivity)
  1
  2
  etc.

The interpretation of the results shall be done by the laboratory in charge of the methods comparison study. The report should state any anomalies from the expected results.

Interlaboratory study

General

The aim of the collaborative study is to compare the performance (accuracy and precision) of the reference method to the alternative method by different collaborators using identical samples examined under reproducibility conditions and to compare these results with preset criteria for the acceptable difference between the reference method and the alternative method. Wherever possible the study conditions should reflect the normal variation between laboratories.

Measurement protocol and samples

The interlaboratory study shall produce 8 valid data sets from at least 8 collaborators. A collaborator is defined as an individual laboratory technician who works completely independently from other collaborators, using their own samples and reagents. Normally these 8 collaborators originate from 8 different laboratories, including data set(s) produced by the organising laboratory; however, different collaborators may come from the same laboratory if necessary. In these cases the collaborators shall come from a minimum of five different organisations (note: laboratories in different locations but belonging to one company or institute are accepted as different organisations). Technicians involved in the preparation of the samples used in the collaborative study shall not take part in the testing of those samples within the collaborative study.

The accuracy and precision estimates should be calculated from a large number of duplicate test results. This figure should be a minimum of 96 results for the one (food) matrix chosen, consisting of 8 collaborators, 3 levels of contamination, 2 methods of enumeration (reference and alternative) and duplicate measurements, i.e. 8 × 3 × 2 × 2 = 96. General guidelines for conducting interlaboratory studies are described in ISO 5725-2.
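The counting rules above (minimum number of results, collaborators and organisations) can be checked mechanically. The following is a minimal sketch; the function names and the (collaborator, organisation) input format are illustrative assumptions and are not defined by this document.

```python
def minimum_results(collaborators=8, levels=3, methods=2, replicates=2):
    """Number of test results implied by the design: 8 x 3 x 2 x 2 = 96."""
    return collaborators * levels * methods * replicates


def design_is_acceptable(collaborators):
    """Check the collaborator counting rules for one study.

    collaborators: list of (collaborator_id, organisation) pairs
    (hypothetical input format). The check requires at least 8 distinct
    collaborators coming from at least 5 different organisations.
    """
    ids = {collaborator_id for collaborator_id, _ in collaborators}
    organisations = {organisation for _, organisation in collaborators}
    return len(ids) >= 8 and len(organisations) >= 5
```

The check only covers the numerical constraints; independence of the collaborators and exclusion of technicians who prepared the samples still have to be verified by the organising laboratory.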
The organizer is responsible for the preparation of the test protocol and a data sheet (see below) for recording all experimental data and critical experimental conditions used by each laboratory. It is necessary for each collaborator to demonstrate competence in the use of the alternative method and of the reference method prior to participating in the study proper. The protocol is the following:

- A relevant food item (for selection see annex B) is used to prepare the test samples. The food should contain a natural background microflora.

- The selected food item may be inoculated with the target organism. The protocol for inoculation of the samples shall be appropriate for the selected food item. Samples shall be prepared to ensure homogeneity between samples using the matrix preparation protocols contained in Annex C and D. In general, liquid samples give greater assurance of homogeneity than solid samples. The samples shall be shown to be homogeneous by the organising laboratory. More information about testing and criteria for homogeneity is given in ISO/TS 22117 [3].

- At least three different levels of contamination shall be used. The analyte concentrations should be chosen to cover at least the lower, middle and upper levels of the entire range of the alternative method. A negative control level should be included in addition.

- Duplicate analyses are done by each collaborating laboratory at the three levels of contamination. All samples should be blind coded to ensure that the analysts are not aware of their level of contamination.

- The analysis of samples shall be performed in each laboratory at the stipulated date.

The organising laboratory shall determine, using all recorded data, which results are suitable for use in analysing the data. The organising laboratory shall examine the raw data and other information requested in the data sheet to ascertain that all collaborators have performed the analyses according to both the alternative and reference methods as written. When there is evidence that results might have been obtained under inappropriate conditions and/or the methods have not been followed strictly, these or all results from the collaborator are excluded from further analysis. No outlier tests are performed on the selected data.

Analysis of Data

The results of the different collaborators for both the reference and alternative method are presented in Table 12.

Table 12 - Presentation of the results of the interlaboratory study per analyte level (k).

Collaborators (i)   Level (k)   Reference method, result xijk    Alternative method, result zijk
                                Duplicate 1    Duplicate 2       Duplicate 1    Duplicate 2
1                   Blank
2                   Blank
etc.                Blank
(I)                 Blank
1                   Low
2                   Low
etc.                Low
(I)                 Low
1                   Medium
2                   Medium
etc.                Medium
(I)                 Medium
1                   High
2                   High
etc.                High
(I)                 High

Validation characteristics calculation

In microbiology, the data do not always follow a normal (i.e. Gaussian) distribution. In order to obtain a more symmetric distribution, sparse counts are better transformed into decimal logarithms. Other, more complicated procedures can also be used. The data are noted as follows:

- $x_{ijk}$, the logarithmic transformation of the microbiological enumeration of collaborator $i$ for replicate $j$ on level $k$, with $1 \le i \le I$, $1 \le j \le J$ and $1 \le k \le K$, using the reference method;
- $z_{ijk}$, the logarithmic transformation of the microbiological enumeration of collaborator $i$ for replicate $j$ on level $k$, with $1 \le i \le I$, $1 \le j \le J$ and $1 \le k \le K$, using the alternative method.

It was also demonstrated that the logarithmic transformation of microbiological counts can satisfactorily approximate the normality assumption of ISO 5725 [4]. In addition, the logarithmic transformation makes it possible to stabilize the repeatability and reproducibility variances over the contamination levels. Therefore, the classical (parametric) calculation procedure described by ISO 5725-2 [5] will be used to compute precision criteria, i.e. repeatability and reproducibility standard deviations.

It is often difficult to make reliable estimates (average, standard deviation, etc.) with a small bias in the presence of outliers. ISO 5725-2 includes outlier tests (Cochran, Dixon, Grubbs) in order to discard badly influencing values and to obtain a better estimate; this, however, reduces the number of useful values for statistical analysis. Therefore, in order to guard against these difficulties, no outlier rejection based on significance tests will be performed. But, as stated in ISO 5725-2, the organizer shall perform an initial step to scrutinize the data, based on, for example, a graphical representation.

Accuracy profile for the interlaboratory study

The principle of the accuracy profile is explained in more detail in Annex I.
Calculation procedure

All computations can be summarized as the following sequence of operations. The notation is: $i$ is the collaborator index ($1 \le i \le I$); $j$ is the replicate index ($1 \le j \le J$); and $k$ is the level index ($1 \le k \le K$). For each level, $x_{ij}$ denotes the results obtained by the reference method and $z_{ij}$ the results obtained by the alternative method, after log transformation for both. Detailed calculations can be found in Rozet [6] or Hubert et al. [7].

Step 1: The acceptability criterion is set at $\pm\lambda$ log units. The acceptability criterion $\pm\lambda$ is expressed as a difference because of the use of logarithms.

Step 2: For each level of contamination, calculate $X_k$ as the median of the log transformed counts obtained with the reference method, noted $x_{ijk}$. These medians are called the reference values of the samples used in the validation study:

$$X_k = \mathrm{median}(x_{ijk})$$

Step 3: Calculate, for each level $k$ (using $z_{ijk}$), the reproducibility standard deviation $s_{kR}$ using Eq. 6, where $s_{kL}$ is the between-laboratory standard deviation and $s_{kr}$ the repeatability standard deviation. This classical statistical procedure is fully described in ISO 5725-2.

$$s_{kR} = \sqrt{s_{kL}^2 + s_{kr}^2} \qquad \text{(Eq. 6)}$$

Step 4: For each level, compute $\bar{z}_k$, the global average of the measurements made with the alternative method.

Step 5: For each level, compute the absolute bias as $\bar{z}_k - X_k$; this is an estimate of the lack of trueness of the alternative method when compared to the reference.

Step 6: For each level, compute the limits of the β-expectation tolerance interval (β-ETI) according to Mee [8]; the β-ETI is the interval in which the expected proportion of future results is β.

NOTE: It is necessary to use another method for the calculation of the β-ETI than for the method comparison study (see 6.1.4.1) because measurements are made under reproducibility conditions in the interlaboratory study and under repeatability conditions in the method comparison study.

a) For each level, the β-ETI can be expressed as:

$$\left[\bar{z}_k \pm k_{kM} \times s_{kR}\right] \qquad \text{(Eq. 7)}$$

where $k_{kM}$ represents the coverage factor of the given level $k$.

b) The coverage factor for level $k$ is equal to:

$$k_{kM} = Q_t\left(\nu;\ \frac{1+\beta}{2}\right)\sqrt{1 + \frac{1}{I \times J \times G^2}} \qquad \text{(Eq. 8)}$$

where $Q_t$ is the percentile of a Student-t distribution, $\beta$ the chosen probability (80 %), $I$ the number of collaborators, $J$ the number of replicates (2) and $\nu$ the number of degrees of freedom. The intermediate parameters are:

$$G = \sqrt{\frac{H + 1}{J \times H + 1}} \qquad \text{(Eq. 9)}$$

$$H = \frac{s_{kL}^2}{s_{kr}^2} \qquad \text{(Eq. 10)}$$

c) The number of degrees of freedom is computed as follows:

$$\nu = \frac{(H + 1)^2}{\dfrac{\left(H + \frac{1}{J}\right)^2}{I - 1} + \dfrac{1 - \frac{1}{J}}{I \times J}} \qquad \text{(Eq. 11)}$$

Step 7: For each level, compute the differences between the limits of the tolerance interval and the target value $X_k$, i.e. $\left[\bar{z}_k \pm k_{kM} \times s_{kR}\right] - X_k$.

Step 8: Make a graphical representation of the computed results as follows:
- the horizontal axis is for the target values $X_k$ in log units;
- the vertical axis is for the bias $\bar{z}_k - X_k$, the acceptability limits $\pm\lambda$ and the tolerance interval limits $\left[\bar{z}_k \pm k_{kM} \times s_{kR}\right] - X_k$, all expressed in log units as differences to the corresponding target value $X_k$ for each level of contamination.

Interpretation of the accuracy profile and acceptability limits

Results are then illustrated in an accuracy profile graph like the example given in Figure 3. This graph is used as a graphical decision-support tool. The upper and lower tolerance-interval limits are connected by straight lines to interpolate the behaviour of the limits between the different levels of the validation samples. The horizontal line represents the reference values obtained with the reference method. The differences between the reference values and the average levels of contamination determined by the alternative method, $\bar{z}_k$, are represented by black dots.

27

EN ISO 16140:2008 (E) Whenever no biases exist, these recovered values are located on the horizontal reference line. In addition, acceptability limits are represented by two dashed horizontal lines and β-ETI limits as broken full lines. For the moment the tentative acceptability limit is set at + 0.5 log units. The alternative method is regarded as being equivalent to the reference method when the values for the β-ETI fall within the acceptability limits for all levels of contamination. When the values for the β-ETI fall outside the acceptability limits for all levels of contamination the alternative method is not regarded is being equivalent to the reference method. In cases where the β-ETI is partly outside the acceptability limit then the application of the alternative method can be restricted to a smaller contamination range or further tests should be done in order to explain the results. An example of the application of the accuracy profile to interlaboratory studies is presented in annex K.
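The calculation steps for one contamination level (Steps 2 to 7, Eq. 6 to Eq. 11) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the variance components are estimated with the balanced one-way analysis of variance of ISO 5725-2 (equal number of replicates per collaborator, non-zero repeatability variance), the Student-t percentile is obtained numerically with the standard library only, and all function and parameter names (including `limit` for the acceptability limit) are hypothetical.

```python
import math
from statistics import mean, median


def variance_components(z):
    """Balanced one-way ANOVA for one level; z holds I rows of J replicates."""
    I, J = len(z), len(z[0])
    lab_means = [mean(row) for row in z]
    grand_mean = mean(lab_means)
    ss_within = sum((v - m) ** 2 for row, m in zip(z, lab_means) for v in row)
    s_r2 = ss_within / (I * (J - 1))                  # repeatability variance
    ms_between = J * sum((m - grand_mean) ** 2 for m in lab_means) / (I - 1)
    s_L2 = max(0.0, (ms_between - s_r2) / J)          # between-laboratory variance
    return grand_mean, s_r2, s_L2


def t_cdf(t, nu, steps=2000):
    """Student-t CDF for t >= 0, by Simpson integration of the density."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    pdf = lambda u: math.exp(log_c - (nu + 1) / 2 * math.log1p(u * u / nu))
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for n in range(1, steps):
        total += (4 if n % 2 else 2) * pdf(n * h)
    return 0.5 + total * h / 3


def t_quantile(p, nu):
    """Student-t percentile for 0.5 < p < 1, found by bisection on t_cdf."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < p:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if t_cdf(mid, nu) < p else (lo, mid)
    return (lo + hi) / 2


def accuracy_profile_level(x, z, beta=0.80, limit=0.5):
    """x, z: I rows of J log10 results (reference, alternative) for one level."""
    I, J = len(z), len(z[0])
    X_k = median(v for row in x for v in row)         # Step 2: reference value
    z_bar, s_r2, s_L2 = variance_components(z)        # Steps 3 and 4
    s_R = math.sqrt(s_L2 + s_r2)                      # Eq. 6
    H = s_L2 / s_r2                                   # Eq. 10 (requires s_r2 > 0)
    G = math.sqrt((H + 1) / (J * H + 1))              # Eq. 9
    nu = (H + 1) ** 2 / ((H + 1 / J) ** 2 / (I - 1) + (1 - 1 / J) / (I * J))  # Eq. 11
    k_M = t_quantile((1 + beta) / 2, nu) * math.sqrt(1 + 1 / (I * J * G ** 2))  # Eq. 8
    lower = z_bar - k_M * s_R - X_k                   # Step 7: limits relative to X_k
    upper = z_bar + k_M * s_R - X_k
    return {"bias": z_bar - X_k, "s_R": s_R, "nu": nu, "lower": lower,
            "upper": upper, "within_limits": -limit <= lower and upper <= limit}
```

Calling `accuracy_profile_level` once per contamination level gives the points of the accuracy profile; the alternative method would be regarded as equivalent when `within_limits` is true for all levels.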

28

EN ISO 16140:2008 (E) Figure 3: General presentation of an accuracy profile for the validation of an alternative method (value for acceptability limit is chosen arbitrarily).

Accuracy (difference in log units) 0.4

Acceptability limit

0.3 Tolerance interval limit

0.2 0.1

Reference value (Log10)

0. 1.5 -0.1 -0.2

2.0

2.5

3.0

3.5

4.0

Average difference between reference and alternative method

-0.3 -0.4 -0.5

29


Annex A (normative) Specific rules for the acceptance of results already obtained in a prior validation scheme

A.1 General

This Annex defines the technical conditions whereby validation study data generated using both International Standards and other reference methods (e.g. AOAC International) will be accepted.

Alternative methods which have not been validated against an International Standard

Validation studies conducted using a reference method which is not an International Standard (e.g. ISO) will be acceptable under this standard provided:

- that they have been conducted according to validation protocols approved by a panel of technical reviewers and the results of such studies have been accepted by them;

- that the technical reviewers were operating under the sponsorship of internationally recognised organisations performing method validations (for example AFNOR, MicroVal, NMKL/NordVal, AOAC International, AOAC Research Institute);

- that the validation studies conform to at least the total sample number and food matrix requirements of this standard.

Alternative methods which have changed and were previously validated to an International Standard

If the alternative method has changed since it was validated, submit a description of the change to an expert review panel for a determination as to whether the change is considered to be major. A major change is one which an expert review panel concludes will adversely impact the accuracy of the test results produced by the alternative method. If the written report of the expert review panel concludes that the change is major, then the results cannot be accepted.

If the changes are determined to be minor by the expert review panel and are related to the method protocol, then a methods comparison study is conducted using 3 representative food matrices and/or environmental surfaces. Results of the study are presented to the expert review panel for review and approval. Other minor changes that do not affect the method protocol, for example a modification in software, do not need to be validated by a method comparison study.

Certification Requirements

Provided that the technical requirements stated in A.2 or A.3, as appropriate, are satisfied, the results generated for the validation of the alternative method will be accepted technically. The specific certification procedures by which results that are in accordance with the technical requirements of Annex A are accepted will be defined by individual certification bodies.


(informative) Classification of sample types for validation studies


(normative) Order of preference for use of naturally and artificially contaminated samples in validation studies.

This Annex gives the order of preference and information on the use of different types of samples in both method comparison studies and collaborative studies.

1st option: Naturally contaminated samples

2nd option: Contamination by mixture

If naturally contaminated samples are found to contain a level that is too high, then the concentration can be reduced by 'dilution' of the naturally contaminated sample with a food of a similar type containing a normal background microflora. The mixed sample should be made so as to have a homogeneous distribution of the target microorganism.

3rd option: Artificially contaminated samples

Strains used for the artificial inoculation should, by preference, have been isolated from the same (food) item, and take into account the natural diversity of the target organism, e.g. serotype, genotype, phenotype. The level of target microflora should be representative of the contamination which occurs in that product naturally. The (food) item should contain a 'normal' background microflora.

4th option: Reference materials

Reference materials, such as certified reference material, containing appropriate but well defined levels of target analyte (microorganisms) in a stable but stressed state, may be used to spike samples for analysis by both qualitative and quantitative methods. For qualitative studies their use should be limited when only a few strains or serotypes of food origin of the target analyte are available as reference materials.


(informative) General Protocols for Contamination by Mixture and Artificially Contaminating Food Matrices.

This Annex provides examples of how artificial contamination of (food) items can be done. Methods used by expert laboratories are not limited to the methods shown here. For the artificial contamination of samples two possibilities are given: seeding and spiking. Seeding is based on the contamination of natural samples by a diluted culture and subsequent storage of the sample for an extended period in order for the microorganism to adapt to the environmental conditions of the food. The spiking protocol is based on the application of relevant stress conditions to a diluted culture and subsequent inoculation of the stressed culture into the food matrix. More information on the spiking methodology can be found in [14].

A: Contamination by Mixture

x gram of naturally contaminated sample is mixed with y gram of non-contaminated sample in order to reach the desired level of contamination. Store the food sample contaminated by mixture at the appropriate storage temperature for that food matrix. Allow the microbial population to equilibrate in the food matrix for a minimum of 1 day before any analysis.
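The masses x and y follow from a simple dilution calculation. The sketch below assumes that the non-contaminated sample contributes no target organisms and that levels are expressed per gram; the function name and arguments are hypothetical.

```python
def mixture_masses(natural_level, target_level, total_mass):
    """Return (x, y): grams of naturally contaminated and non-contaminated sample.

    Assumes the diluting food contains no target organisms, so that
    target_level = natural_level * x / (x + y), with x + y = total_mass.
    Levels are per gram (e.g. cfu/g).
    """
    if not 0 < target_level <= natural_level:
        raise ValueError("target level must be positive and not exceed the natural level")
    x = total_mass * target_level / natural_level
    return x, total_mass - x
```

For example, under these assumptions, bringing a sample at 1000 cfu/g down to 50 cfu/g in 100 g of mixture needs 5 g of naturally contaminated sample and 95 g of non-contaminated sample. The level actually reached should still be checked by analysis after equilibration.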

B: Artificially Contaminating Food Matrices using seeding protocol

B-1: Artificial Contamination of High Moisture Foods with a Liquid (Broth) Culture

Preparation of contaminating microorganism(s) by single sample inoculation

1) culture target strain
Inoculate a tube of non selective enrichment broth with the designated strain. Incubate the broth at optimal conditions for the strain.

2) adjust level by dilution
After incubation, dilute the culture in a suitable diluent to the desired level(s). The level of dilution required is dependent on the (food) item to be inoculated, the level of contamination, the strain used and the storage conditions of the food item.

3) inoculate into food by pipetting known volume or spraying known volume
The diluted culture is inoculated into the (food) item by spraying or pipetting. Inoculation of individual samples is preferred. The volume of the inoculum should be as low as possible as it should not influence the water activity (aw) significantly. As a general rule 0.25 ml per 25 gram of sample is used.

4) mix to ensure homogeneity
After inoculation the (food) item is mixed thoroughly to ensure homogeneity. In case the inoculum is added in steps, mixing should be done after each step.

5) apply a stress
Apply an appropriate treatment to the samples, for example:
a) thermal treatment (e.g. 5 min at 50 °C) by immersion in a bath at the given temperature
b) freezing treatment (e.g. 72 h at -20 °C)
c) storage at 4 °C (e.g. for 1 week)

6) leave sample for stabilisation/stress
Store the (food) item at the appropriate storage temperature (preferably the normal storage temperature) for that (food) item. Consider also the potential growth or survival of the organism during the storage period. Examples are: nuts would be stored at room temperature, orange juice would be stored at 2-8 °C and ice cream would be stored frozen at less than 0 °C. Allow the microbial population to equilibrate in the food matrix for a minimum of 2 days, depending on the shelf life of the product involved. For example, perishable foods should be stored for a minimum of 2 days, frozen foods for a minimum of 14 days.

7) check test sample for level of contamination
After the samples have been stored for the appropriate time, check the level of contamination before using the samples in the validation study.

B-2: Artificial Contamination of Low Moisture Foods with a Lyophilized Culture

1) prepare a lyophilised culture
Inoculate a tube of non selective enrichment broth with the designated strain. Incubate the broth at optimal conditions for the strain. After incubation, collect the bacterial cells by centrifugation. Wash the cells twice with a sterile buffered diluent. Repeat the centrifugation and decant the supernatant. Resuspend the pellet in sterile 10% NFDM (non fat dried milk). Transfer the resuspended cells into appropriate containers for lyophilization.

2) assess level of target organism
Collect the lyophilized cell suspensions in a sterile container. Manually crush the lyophilized culture to create a homogeneous fine powder before assessment of the level of contamination. Use a non selective method for the determination of the level of contamination and incubate the plates under optimal growing conditions.

3) inoculate the lyophilised culture into the food to attain the required level
Mix 0.1 g of the lyophilized culture with 10 g of the uninoculated (food) matrix in, for example, a sterile plastic bag. The bag is shaken until the inoculum appears to be evenly distributed throughout the food matrix. Perform serial ten-fold dilutions with the (food) item (e.g. 1 g from the first step with 9 g (food) item, etc.) to dilute the lyophilized culture to the appropriate level. Ensure that proper mixing occurs at each dilution level.

4) store food
Store the (food) item at the appropriate storage temperature (preferably the normal storage temperature) for that (food) item. Allow the microbial population to equilibrate in the (food) item for a minimum of 1 week before any analysis.

5) check level
After the samples have been stored for the appropriate time, check the level of contamination before using the samples in the validation study.

C: Artificially Contaminating Food Matrices using spiking protocol

The spiking protocol consists of the different steps presented below.

1) culture target strain
Inoculate a tube of non selective enrichment broth with the designated strain. Incubate the broth at optimal conditions for the strain.

2) adjust level by dilution
After incubation, dilute the culture in a suitable diluent to the desired level(s). Injury protocols are usually done on pure cultures with 10^4 to 10^5 cells/ml.

3) apply an injury protocol
Apply an appropriate culture treatment, for example:
- heat treatment (e.g. 15 min at 50 °C) by immersion in a bath at the given temperature
- freezing treatment (e.g. 72 h at -20 °C)
- chemical treatment (e.g. treatment at high salt concentration or at low pH)
- storage at 4 °C (e.g. for 1 week minimum)

Note: The conditions for applying stress strongly depend on the type of micro-organism and even on the selected strain. The selected stress protocol should also resemble the stress of the micro-organisms found in the food sample to be used for spiking.

4) injury measurement
Injury efficiency is usually evaluated by enumerating the pure culture on selective and non selective agars. A difference of more than 0,5 log CFU/g is expected for a sufficient stress application.

5) adjust level by dilution
Dilute the culture in a suitable diluent to the desired level, in order to inoculate the matrix with less than 5 cells/g.

6) inoculate into food by pipetting known volume or spraying known volume
The diluted culture is inoculated into the (food) item by spraying or pipetting. Inoculation of individual samples is preferred. The volume of the inoculum should be as low as possible as it should not influence the water activity significantly. As a general rule 0.25 ml per 25 gram of sample is used.

7) mix to ensure homogeneity
After inoculation the (food) item is mixed thoroughly to ensure homogeneity. In case the inoculum is added in steps, mixing should be done after each step.


(informative) Example for presenting results of accuracy study of the method comparisons study for qualitative methods.

In the accuracy study part of the method comparison study for qualitative methods (see 5.1.1.1.3 for paired studies and 5.1.2.1.3 for unpaired studies) many data are generated. In order to have an overview of the data, different tables need to be filled. This annex gives an example of filled tables for both paired and unpaired studies, taking into account the effect of confirmation of results obtained by the alternative method.

For paired data (see also section 5.1.1.1.3) an example of filled Tables 1 and 2 for a specific category (n = 60 samples tested) and the types within that category is given. The category is milk(products) and the types are raw milk cheese, milk powder and pasteurised milk.

Milk(products)                      Reference method positive (R+)   Reference method negative (R-)
Alternative method positive (A+)    26                               4 (5)
Alternative method negative (A-)    2                                28 (27)

Values are confirmed results, with unconfirmed results of the alternative method between brackets.

In the following tables, C = confirmed and U = unconfirmed results of the alternative method.

Category / type     PA    NA (C)   NA (U)   ND    PD (C)   PD (U)   Sum (N)   N+    N-
Milk(products)      26    28       27       2     4        5        60        28    32
- cheese            7     8        7        2     3        4        20        9     11
- milk powder       9     10       10       0     1        1        20        9     11
- past. milk        10    10       10       0     0        0        20        10    10

Category / type     Relative accuracy   Relative accuracy   Relative sensitivity   Relative sensitivity,   Relative sensitivity,   Relative specificity   Relative specificity
                    AC (%) (C)          AC (%) (U)          SE (%)                 ref. method SEref (%)   alt. method SEalt (%)   SP (%) (C)             SP (%) (U)
Milk(products)      90                  88                  93                     88                      94                      88                     84
- cheese            75                  70                  78                     75                      83                      73                     64
- milk powder       95                  95                  100                    90                      100                     91                     91
- past. milk        100                 100                 100                    100                     100                     100                    100
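The percentages in the paired example can be recomputed from the four counts PA, NA, ND and PD. The formulas are defined in the body of the standard rather than in this annex; the sketch below assumes the usual definitions (AC = (PA + NA)/N, SE = PA/N+ with N+ = PA + ND, SP = NA/N- with N- = NA + PD), which reproduce the confirmed values of the milk(products) example. The function name is hypothetical.

```python
def paired_summary(pa, na, nd, pd):
    """Summary statistics for a paired qualitative study (assumed definitions).

    pa: positive agreement, na: negative agreement,
    nd: negative deviation, pd: positive deviation.
    """
    n = pa + na + nd + pd
    n_pos = pa + nd          # N+: samples positive by the reference method
    n_neg = na + pd          # N-: samples negative by the reference method
    return {
        "N": n,
        "N+": n_pos,
        "N-": n_neg,
        "AC": 100 * (pa + na) / n,   # relative accuracy (%)
        "SE": 100 * pa / n_pos,      # relative sensitivity (%)
        "SP": 100 * na / n_neg,      # relative specificity (%)
    }
```

With the confirmed category totals of the example, `paired_summary(26, 28, 2, 4)` gives N = 60, AC = 90 %, SE of about 93 % and SP of 87.5 %, reported as 88 % in the table.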

For unpaired data (see also section 5.1.2.1.3) an example of filled Tables 4 and 5 for a specific category (n = 60 samples tested) and the types within that category is given. The category is meat(products) and the types are cooked ham, salami and raw minced meat.

Meat(products)                      Reference method positive (R+)   Reference method negative (R-)
Alternative method positive (A+)    26 (26)                          7 (13)
Alternative method negative (A-)    1 (1)                            31 (37)

Values are confirmed results, with unconfirmed results of the alternative method between brackets.

Confirmed results

Category / type     PA    NA    ND    PD    Sum (N)   Relative accuracy   N+    Relative sensitivity   N-    Relative specificity
                                                      AC (%)                    SE (%)                       SP (%)
Milk(products)      25    37    2     7     71        87                  32    78                     39    95
- cheese            10    12    0     2     24        92                  12    83                     12    100
- milk powder       8     14    1     0     23        92                  8     100                    15    93
- past. milk        7     11    1     5     24        75                  12    58                     12    92

Unconfirmed results

Category / type     PA    NA    ND    PD    Sum (N)   Relative accuracy   N+    Relative sensitivity   N-    Relative specificity
                                                      AC (%)                    SE (%)                       SP (%)
Milk(products)      26    31    1     13    71        80                  39    67                     32    97
- cheese            10    9     0     5     24        80                  15    67                     9     100
- milk powder       9     14    0     0     23        96                  9     100                    14    100
- past. milk        7     8     1     8     24        63                  15    47                     9     89


(normative) Points to be considered when selecting strains for testing selectivity

A.2 General

This annex outlines the minimum test requirements for general use. In the selection of test strains the majority should originate from the range of food materials used in the study and cover the recognised range of the target analyte with respect to the following: diversity in identification characteristics (e.g. biochemical, serotype, phage type, etc.), geographical distribution, incidence, and any other claims made by the producers of the alternative method.

Target group categories

- undefined group, for example total count, coliform, yeast, lactic acid bacteria;
- family, for example Enterobacteriaceae;
- genus, for example Salmonella, Pseudomonas, Listeria;
- species, for example Listeria monocytogenes, Staphylococcus aureus, Escherichia coli;
- strain, for example Salmonella enteritidis phage type 4.

According to the target group specified in E.2 a range of positive microorganisms can be chosen:

a) For undefined groups for which the target group is defined by the reference method, the strains used shall be selected from those capable of typical growth in the reference method;

b) For families: use strains from a range of genera in that family and if possible include a representative member of all genera in the family;

c) For genera: use a range of species from that genus and if possible test as many species as possible in the genus;

d) For species: a range of strains from that species. For the selection of strains other, more detailed ways of subtyping need to be considered. For example, Salmonella and Listeria are serotyped and phage typed. In the future other (genetic) typing methods will be used. In defining the positive strains to be used, organising laboratories should use available up to date information to ensure that strains are relevant, at the time of testing, to the target (food) categories;

e) For strain: a range of sources of that strain.

Non target groups used in selectivity study

a) The non-target groups (that is, those expected to be negative and being used for cross reactivity tests) should be specified according to the target group;

b) When the target group is a family: non-target strains shall include other families;

c) When the target group is a genus: non-target strains shall include other genera considered to be similar to the target genus;

d) When the target group is a species: non-target strains shall include other species within the target genus;

e) When the target group is a strain: non-target strains shall include other strains within the same species.


(normative) Test applied to the examination of discordant results.

Count the total number of discordant results Y as follow: Y = PD + ND (for example PD = 2, ND = 10, then Y = 12). PD and ND coming from Table 1 (5.1.1.3.1). Check if the two methods could be different for the balance of sensitivity versus specificity: 

for Y < 6, (less than six disagreements): no test is available;



for 6  Y  22, (i.e. between 6 and 22 disagreements), determine m as the smallest of the two values of PD and ND (for example m = PD = 2, because PD < ND) and use the binomial law according to the following Table F.1:

if m  M for a given Y, the two methods are different at  < 0,05 (2-sided). Table F.1 - M values for Y disagreements (6  Y  22) Disagreements Y = PD + ND M = Max(m) for  < 0,05

6 to 8

9 to 11

12 to 14

15 to 16

17 to 19

20 to 22

0

1

2

3

4

5

For example, for Y = 12 disagreements and m = 2, M = 2 and m  M: thus, the two methods are different with p < 0,05. 

for Y > 22 (more than 22 disagreements): use the McNemar test with the chi-square distribution for 1 degree of freedom:

$\chi^2 = d^2 / Y$, with d = PD - ND and Y = PD + ND

The two methods are different at α < 0,05 (2-sided) if χ² > 3,841.

This chi-square test corresponds to the minimal d for each Y given in Table F.2 for α < 0,05. (That is, for a given Y, the absolute value of d shall be equal to or greater than the value given in Table F.2 to conclude that the two methods are different.)

Table F.2 - d values for Y disagreements (Y > 22)

Disagreements Y = PD + ND    d = PD - ND
22 to 26                     10
27 to 31                     11
32 to 37                     12
38 to 44                     13
45 to 51                     14
52 to 58                     15
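The McNemar rule for more than 22 disagreements can be sketched in a few lines of Python. The critical value 3.841 is the one quoted in the text (3,841 in the decimal-comma notation used there); the function names are illustrative only.

```python
CHI2_CRITICAL_1DF = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def mcnemar_statistic(pd: int, nd: int) -> float:
    """Return d^2 / Y with d = PD - ND and Y = PD + ND."""
    y = pd + nd
    if y == 0:
        raise ValueError("no discordant results")
    d = pd - nd
    return d * d / y


def methods_differ_mcnemar(pd: int, nd: int) -> bool:
    """Decision for Y > 22: the methods differ if the statistic exceeds 3.841."""
    if pd + nd <= 22:
        raise ValueError("this rule is only defined for Y > 22")
    return mcnemar_statistic(pd, nd) > CHI2_CRITICAL_1DF
```

For instance, PD = 8 and ND = 18 give Y = 26 and d = -10, so the statistic is 100/26, about 3.85, just above the critical value. This matches the tabulated minimum of 10 for Y up to 26.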


(informative) Example calculation of RLOD in a method comparison study and an interlaboratory study.

Method comparison study

Food category                level   x (cfu/g)   ntot   npos ref   npos alt
Milk and dairy products      1       0.0112      6      3          0
Milk and dairy products      2       0.0224      6      2          0
Milk and dairy products      3       0.03733     6      4          3
Milk and dairy products      4       0.06589     6      5          4
Milk and dairy products      5       0.1044      6      6          6
Meat and meat products       1       0.00995     6      1          1
Meat and meat products       2       0.01327     6      3          1
Meat and meat products       3       0.02475     6      5          3
Meat and meat products       4       0.03465     6      6          2
Meat and meat products       5       0.0495      6      6          5
Meat and meat products       6       0.0596      6      6          5
Meat and meat products       7       0.0892      6      6          6
Eggs and derivates           1       0.01224     6      1          1
Eggs and derivates           2       0.018       6      2          2
Eggs and derivates           3       0.03213     6      4          4
Eggs and derivates           4       0.05623     6      6          5
Eggs and derivates           5       0.08837     6      6          6
Fish and seafood products    1       0.01045     6      0          2
Fish and seafood products    2       0.01393     6      2          1
Fish and seafood products    3       0.01587     6      5          2
Fish and seafood products    4       0.0256      6      4          4
Fish and seafood products    5       0.02777     6      6          2
Fish and seafood products    6       0.04        6      5          5
Fish and seafood products    7       0.04533     6      6          4
Fish and seafood products    8       0.0832      6      6          5
Fish and seafood products    9       0.15867     6      6          6
Feeding stuffs               1       0.0142      6      2          2
Feeding stuffs               2       0.02367     6      2          3
Feeding stuffs               3       0.03787     6      6          5
Feeding stuffs               4       0.0843      6      6          6

Note: the number of samples in this dataset is smaller than required in this standard (60 per food category).

For statistical analysis, the vectors with the numbers of positive samples for ref and alt are combined, and the other information is duplicated, to obtain a dataset with twice the number of rows. The data have been analyzed with a specific generalized linear model (GLM); such models are commonly available in statistical software. For this example GenStat has been used. See McCullagh and Nelder [10] for general information on GLMs.

The numbers of positives are described by a binomial distribution. The expected fraction of positives, $p = E(n_{pos} / n_{tot})$, is linked to the expected number of cfu in the sample, $\lambda$, by

$p = 1 - \exp(-\lambda)$

and additive models for $z = \ln(\lambda)$ are used. This implies a complementary log-log relation between z and p:

$z = \ln(\lambda) = \ln(-\ln(1 - p))$

The options binomial distribution and complementary log-log link function can be specified as inputs in the statistical program.
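The link between the expected number of cfu in a sample and the probability of a positive result can be checked numerically. The sketch below writes lam for the expected number of cfu (a symbol chosen here for illustration) and uses hypothetical function names; it only restates the complementary log-log relation described above.

```python
import math


def prob_positive(lam: float) -> float:
    """p = 1 - exp(-lam): probability that a sample contains at least one cfu."""
    return -math.expm1(-lam)


def cloglog(p: float) -> float:
    """Complementary log-log link: z = ln(-ln(1 - p)), for 0 < p < 1."""
    return math.log(-math.log1p(-p))


def inv_cloglog(z: float) -> float:
    """Inverse link: p = 1 - exp(-exp(z))."""
    return -math.expm1(-math.exp(z))
```

Applying `cloglog` to `prob_positive(lam)` returns ln(lam), which is why additive models on the z scale correspond to multiplicative effects on the expected number of cfu.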


Models to be used for z depend on whether the level of contamination of the sample is known or unknown. Results are presented here for both possibilities.

Contamination levels x known

The following models are relevant:

1. Single category:

$z = \ln(s \cdot x) + D$

where:
s is the sample size;
x is the contamination level of the sample;
D is the difference between the alternative and the reference method, estimated by the program.

For the term $\ln(s \cdot x)$ no parameter has to be estimated. This part is known as the offset; it can be calculated separately and put into the GLM program as an offset variable.

2. Combined categories, model with interaction:

$z = \ln(s \cdot x) + D + M_m + (D.M)_m$

In addition to D, parameters $M_m$ and $(D.M)_m$ are estimated for all categories m.

3. Combined categories, model without interaction:

$z = \ln(s \cdot x) + D + M_m$

The significance of the interaction is tested by comparing the fit of models 2 and 3 with a log-likelihood test.

4. Combined categories, model without category effects:

$z = \ln(s \cdot x) + D$ (same model as 1, but here applied to combined data)

The significance of the categories main effects is tested by comparing the fit of models 3 and 4 with a log-likelihood test.

Model 1 is fitted to the data of the separate categories to derive RLODs: $RLOD = \exp(-D)$. Models 2-4 are fitted to the combined data, and are first used to test for significance of the interaction and the categories main effects. Estimates for D from models 3 and 4 can be used to calculate general RLODs. This is not possible for model 2 (here the interaction terms would also need to be included; doing so would give the same RLOD estimates as obtained with model 1 from single category data).

Results are shown in Table F1. RLOD values vary between categories from 1.0 to 2.6, with upper 95 % confidence limits between 2.3 and 4.5. If, for example, an acceptability limit of 4 had been set, then the method would not have been sufficiently validated for minced meat and raw milk. Larger numbers of samples would improve this situation, because they are expected to lead to narrower confidence intervals. The likelihood ratio tests reveal no significant category effects (neither interaction nor main effects). Therefore the combined RLOD estimate from model 4 is relevant, which is 1.7 with an upper confidence limit of 2.2.

Table F1. Results for estimating RLOD when sample contamination levels x are known. c.i. = confidence interval. pval = p values of likelihood ratio test testing for significance of RLOD different from 1 or for significance of category effects.

Category                                           RLOD (90% c.i.)    pval
Milk and dairy products                            2.0 (1.0 – 4.1)    0.07
Meat and meat products                             2.6 (1.4 – 4.5)    0.004
Eggs and derivates                                 1.2 (0.6 – 2.3)    0.68
Fish and seafood products                          2.0 (1.2 – 3.3)    0.02
Feeding stuffs                                     1.0 (0.5 – 2.3)    0.94
test for interaction method.categories                                0.36
test for significance of categories main effect                      0.12
Combined categories (model 3)                      1.8 (1.4 – 2.3)    < 0.001
Combined categories (model 4)                      1.7 (1.3 – 2.2)    < 0.001
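The single-category estimate (model 1, contamination levels known) can be approximated without a statistics package. The sketch below is not the GenStat analysis used for the table. It assumes that each method gets its own constant on the z scale, fitted by maximum likelihood with ln(x) as offset, and that D is the difference between the two constants. Under that assumption the sample size s adds the same constant to both methods and cancels in the RLOD. Function names are hypothetical, and confidence intervals and p values are not computed.

```python
import math


def fit_constant(levels, n_tot, n_pos, lo=-30.0, hi=30.0):
    """Maximum-likelihood constant a in z = ln(x) + a (binomial, cloglog link)."""
    if sum(n_pos) == 0 or sum(n_pos) == sum(n_tot):
        raise ValueError("both positive and negative results are needed")

    def score(a):
        # Derivative of the binomial log-likelihood with respect to a.
        total = 0.0
        for x, n, k in zip(levels, n_tot, n_pos):
            mu = x * math.exp(a)  # expected cfu, up to the constant factor s
            total += k * mu * math.exp(-mu) / (-math.expm1(-mu)) - (n - k) * mu
        return total

    # The score decreases monotonically in a, so bisection finds its root.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rlod(levels, n_tot, pos_ref, pos_alt):
    """RLOD = exp(-D), with D = constant(alt) - constant(ref)."""
    d = fit_constant(levels, n_tot, pos_alt) - fit_constant(levels, n_tot, pos_ref)
    return math.exp(-d)


# Milk and dairy products rows of the example dataset
milk_levels = [0.0112, 0.0224, 0.03733, 0.06589, 0.1044]
milk_n_tot = [6, 6, 6, 6, 6]
milk_pos_ref = [3, 2, 4, 5, 6]
milk_pos_alt = [0, 0, 3, 4, 6]

print(round(rlod(milk_levels, milk_n_tot, milk_pos_ref, milk_pos_alt), 2))
```

For the milk and dairy rows this should give a value close to the RLOD of 2.0 reported for that category. Bisection on the score is used instead of Newton iterations because it is simple and needs no starting value.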

Contamination levels x unknown

The following models are relevant:

5. Single category:

$z = \ln(s) + L_i + D$

Parameters $L_i$ are fitted estimates of the contamination level of the samples.

6. Combined categories, model with interaction:

$z = \ln(s) + L_i + D + (D.M)_m$

7. Combined categories, model without interaction:

$z = \ln(s) + L_i + D$ (same model as 5, but here applied to combined data)

Model 5 is fitted to the data of the separate matrices to derive RLODs: $RLOD = \exp(-D)$. Models 6 and 7 are fitted to the combined data, and are first used to test for significance of the interaction of method with category. Note that a test for category main effects is not possible in this case, because all sample contamination levels had to be estimated. Estimates for D from model 7 can be used to calculate a general RLOD.

Results are shown in Table F2. RLOD values vary between matrices from 1.2 to 4.0, with upper 95 % confidence limits between 3.5 and 10. Note that these are much higher than in the case where the x information was used. If, for example, an acceptability limit of 4 had been set, then the method would not be validated for minced meat and raw milk. Larger numbers of samples would improve this situation, because they are expected to lead to narrower confidence intervals. The likelihood ratio tests reveal no significant matrix effect. Therefore the combined RLOD estimate from model 7 is relevant, which is 2.2 with an upper confidence limit of 3.0.

Table F2. Results for estimating RLOD when sample contamination levels x are unknown. c.i. = confidence interval. pval = p values of likelihood ratio test testing for significance of RLOD different from 1 or for significance of category effects.

Category

RLOD (90% c.i.)

pval

Milk and dairy products

2.6 (0.9 – 7.5)

0.04

Meat and meat products

4.0 (1.6 – 10)