When food manufacturers specify a maximum limit for the amount of foreign material (FM) in the lot, handlers estimate the true percent FM in a commercial lot by measuring FM in a small sample taken from the lot before shipment to a food manufacturer. Because of the uncertainty (variability) in FM among samples taken from the same lot, it is difficult to obtain a precise estimate of the true FM in the lot. The objectives of this study were to (1) measure the variability and FM distribution among sample test results when estimating the true lot proportion of FM in a lot of shelled peanuts, (2) compare the measured variability and FM distribution among sample test results to that predicted by the binomial distribution, (3) develop a computer model, based upon the binomial distribution, to evaluate the performance (buyer's risk and seller's risk) of sampling plan designs used to estimate FM in a bulk lot of shelled peanuts, and (4) demonstrate with the model the effect of increasing sample size to reduce misclassification of lots.
Eighty-eight samples, 9 kg (20 lb) each, were selected at random from each of six commercial lots of shelled medium runner peanuts. The percent FM (PFM), based upon number of kernels, was determined for each sample. The mean, variance, and distribution among the 88 sample test results were calculated for each of the six lots. Results indicated that the variance and distribution among the 88 sample test results are very similar to those predicted by the binomial distribution. The performance of various sampling plan designs was demonstrated using the binomial distribution.
As a natural process related to harvesting, handling, and curing, farmers' stock (FS) peanuts may contain a variety of foreign material (FM) such as dirt, sticks, rocks, grain seeds, metal objects, and glass (
Food manufacturers may specify that FM in a lot provided by a sheller not exceed a maximum limit or maximum concentration. Foreign material concentration may be expressed as the number of FM pieces per unit mass of peanuts or the maximum number of FM pieces in the entire lot. For example, the food manufacturer may specify that lots should not contain more than one piece of FM per 2000 pounds of peanuts. With food manufacturers specifying lower and lower limits for FM, shellers have two problems: (1) having the resources and/or technology to remove FM so that shelled lots do not exceed maximum limits specified by the food manufacturers and (2) accurately estimating the true level of FM in a processed lot to determine if contract specifications have been met. The sheller measures the FM in a lot at origin and the food manufacturer may measure the FM at destination to determine if the sheller has met contract specifications. This paper deals only with the second issue of how to get an accurate estimate of the true level of FM in the lot.
The sheller and food manufacturer estimate the FM contamination in the bulk lot by taking one or more samples from the lot and counting the number of FM pieces in a sample of a given mass. Because the number of FM pieces among replicate samples taken from the same lot will differ, the sheller can never determine the true proportion of FM in the lot with 100% confidence. Because of the variability among sample test results, some lots will be misclassified. There is a chance that samples from a good lot will test bad (false positive or seller's risk). There is also a chance that samples from a bad lot will test good (false negative or buyer's risk). It is important for the food manufacturer and sheller to know the effect of sample size (or number of samples of a given size) on the uncertainty associated with using samples to estimate the true proportion of FM in a lot and how to reduce misclassification of lots relative to a limit specified by the food manufacturer.
The objectives of this study were to (1) measure the variability and FM distribution among sample test results when estimating the true lot proportion of FM in a lot of shelled peanuts, (2) compare the measured variability and FM distribution among sample test results to that predicted by the binomial distribution, (3) develop a computer model, based upon the binomial distribution, to evaluate the performance (buyer's risk and seller's risk) of sampling plan designs used to estimate FM in a bulk lot of shelled peanuts, and (4) demonstrate with the model the effect of increasing sample size to reduce misclassification of lots.
With the assistance of a peanut sheller in the southeastern U.S., six lots of shelled medium runner peanuts were identified for sampling to estimate the percent FM in each lot. Each lot was about 20 metric tonnes or about 45,400 pounds. Shelled medium runner peanut grades, defined by the Southeastern Peanut Association, are peanuts that are milled through a screen having 8.3 mm × 19.1 mm (21/64 by ¾ inch) openings and which will either meet an average of 40 to 50 count per ounce or which will pass through a screen having 7.1 mm × 19.1 mm (18/64 by ¾ inch) openings. A total of 88 samples, 9 kg (about 20 pounds) each, were taken from each lot. Using a scooping device, the 9 kg samples were taken at even intervals (about every 500 pounds) from the beginning to the end of the lot. Each 9 kg sample was identified with a lot number and sample number. Personnel from the Federal State Inspection Service weighed each sample, removed FM from each sample, and counted the total number of FM pieces (N) in each sample. The number of kernels in the 9 kg sample (n) was not counted, but was estimated by multiplying the sample mass by the mean count per unit mass (industry standard of 1584 per kg (45 per oz) for shelled medium runner peanuts). The percent FM (PFM) for each sample was calculated by dividing the number of FM pieces (N) by the number of kernels (n) in the sample (N/n) and multiplying by 100. The lot number, sample number, sample weight, number of FM pieces, and percent FM were recorded in a database. The mean, m, and variance, s², among the 88 sample test results for PFM were computed for each lot.
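The PFM calculation described above can be sketched in a few lines. This is a minimal illustration rather than the software used in the study; the function names are hypothetical, and only the 1584 kernels per kg figure and the PFM = 100N/n definition come from the text.

```python
# Minimal sketch of the PFM calculation; function names are hypothetical.
KERNELS_PER_KG = 1584  # industry standard count quoted in the text


def estimated_kernels(sample_mass_kg):
    """Estimate the kernel count n from the sample mass (kernels were not counted)."""
    return sample_mass_kg * KERNELS_PER_KG


def percent_fm(fm_pieces, sample_mass_kg):
    """PFM = 100 * N / n, with n estimated from the sample mass."""
    return 100.0 * fm_pieces / estimated_kernels(sample_mass_kg)


# A nominal 9 kg sample containing one FM piece.
print(estimated_kernels(9.0))
print(percent_fm(1, 9.0))
```

Because n is estimated from mass rather than counted, any error in the assumed count per kg carries directly into PFM.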
It was assumed that a peanut lot consisted of only two types of objects, peanut kernels and pieces of FM, and that the FM had physical characteristics similar to a peanut. The last assumption may be valid because any FM that remains in the lot after all shelling operations is probably very similar in physical characteristics to a peanut kernel; otherwise the FM piece would have been removed. It was assumed that the variance and FM distribution among the 88 sample test results per lot could be described by the binomial distribution (
From the binomial function, the probability of obtaining exactly k successes (k pieces of FM) in a sample of n kernels taken from a lot with a true proportion of FM pieces of p is described by Equation 1.
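Equation 1 itself is not reproduced in this text. Assuming it is the standard binomial probability function written with the symbols defined above (k FM pieces, n kernels, true proportion p), it would take the following form:

```latex
P(N = k) \;=\; \binom{n}{k}\, p^{k}\,(1-p)^{\,n-k},
\qquad k = 0, 1, \ldots, n \tag{1}
```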
If a sample test result is expressed as the number of FM pieces (N), then the mean (μ) among replicate sample test results of n peanuts is n times p, or np, if the true proportion of FM pieces in the lot is p. The variance σ² among replicate sample test results of n peanuts is np(1 − p). If the number of FM pieces is expressed as a percent of the total number of peanuts in the sample (PFM = 100N/n), the mean and variance are described by Equations 2 and 3, respectively.
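Equations 2 and 3 are also missing from this text. A reconstruction that follows from the statements above (mean np and variance np(1 − p) for the count N, rescaled by 100/n to give PFM) is sketched here; the exact form printed in the original may differ.

```latex
\mu \;=\; \frac{100}{n}\,(np) \;=\; 100\,p \tag{2}

\sigma^{2} \;=\; \left(\frac{100}{n}\right)^{2} n\,p\,(1-p)
\;=\; \frac{100^{2}\,p\,(1-p)}{n}
\;=\; \frac{\mu\,(100-\mu)}{n} \tag{3}
```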
From Equation 3, it can be seen that the variance is a quadratic function of the mean μ. As p increases from 0 to 50%, the variance increases. As p continues to increase from 50 to 100%, the variance decreases. At p = 0 and 100%, the variance is zero, and at p = 50% the variance is a maximum.
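The quadratic behavior can be checked numerically. The sketch below assumes the PFM variance takes the form μ(100 − μ)/n, which follows from the count variance np(1 − p) with PFM = 100N/n; the function name and the choice of n (a 9 kg sample at 1584 kernels per kg) are illustrative.

```python
# Sketch: variance of PFM as a function of the mean PFM (mu, in percent).
# Assumes variance = mu * (100 - mu) / n, derived from np(1 - p) and PFM = 100N/n.

def pfm_variance(mean_pfm, n):
    """Binomial variance among PFM results for samples of n kernels."""
    return mean_pfm * (100.0 - mean_pfm) / n


n = 14256  # kernels in a 9 kg sample at 1584 kernels per kg
for mu in (0.0, 25.0, 50.0, 75.0, 100.0):
    print(mu, pfm_variance(mu, n))
```

For the very low FM levels of interest in shelled lots, μ is close to zero, so the variance is nearly proportional to μ/n and halves each time the sample size doubles.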
The mean μ, variance σ², and distribution among sample test results predicted by the binomial distribution were compared to the experimentally measured mean m, variance s², and observed distribution among the 88 sample test results for each lot when the sample test result is expressed as percent FM pieces (PFM). The comparisons were made to determine if the binomial function can be used to accurately predict the effect of sample size on the variability and distribution among sample test results. From the distribution of sample test results, the effect of sample size on the buyer's risk and seller's risk can be predicted (
The number of samples, average sample size in grams (g) and number of kernels (n), the sum of the number of FM pieces (ΣN) among all samples for each lot, and the estimated true proportion of FM pieces in each lot (p) are shown in
Number of samples, average sample size, and total number of foreign material (FM) pieces found in each lot of US medium runner peanuts.
The frequency distribution or the number of samples that contained 0, 1, or 2 pieces of foreign material is shown in
Number of samples (% of total samples) with 0, 1, or 2 foreign material (FM) pieces for each lot.
Observed and predicted variances among sample test results for each lot. The predicted variances were calculated from Equation 3.
Another way to judge the suitability of the binomial distribution is to compare the observed cumulative distribution among sample test results (constructed from
Comparison of the observed and predicted cumulative distributions for lot 4. The predicted distribution was calculated with the binomial function.
The Kolmogorov-Smirnov (K-S) goodness of fit (GOF) test was used to determine with 95% confidence if the observed distribution could have been sampled from a binomial distribution (
Maximum difference (Dmax) between the observed and predicted cumulative distributions, the critical difference (Dcritical) at the 95% confidence level, and the number of samples in the observed distribution. If Dmax ≤ Dcritical, the null hypothesis that the observed distribution was sampled from a binomial distribution cannot be rejected at the 95% confidence level.
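The Dmax comparison can be sketched as follows. The observed counts in this example are hypothetical and are not the study's data. The critical value uses the common large-sample approximation 1.36/√(number of samples) for a 5% significance level, which is an assumption here; the source does not state how Dcritical was obtained.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k FM pieces in a sample of n kernels."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def ks_dmax(observed_counts, n, p):
    """Largest gap between observed and binomial cumulative distributions.

    observed_counts[k] is the number of samples that contained k FM pieces.
    """
    total = sum(observed_counts)
    cum_obs = 0.0
    cum_pred = 0.0
    dmax = 0.0
    for k, count in enumerate(observed_counts):
        cum_obs += count / total
        cum_pred += binomial_pmf(k, n, p)
        dmax = max(dmax, abs(cum_obs - cum_pred))
    return dmax


# HYPOTHETICAL counts of samples with 0, 1, and 2 FM pieces (not study data).
observed_counts = [60, 22, 6]
n = 14256  # kernels per 9 kg sample at 1584 kernels per kg

num_samples = sum(observed_counts)
total_fm = sum(k * c for k, c in enumerate(observed_counts))
p_hat = total_fm / (num_samples * n)  # estimated lot proportion of FM

d_max = ks_dmax(observed_counts, n, p_hat)
d_critical = 1.36 / sqrt(num_samples)  # ASSUMED large-sample 5% critical value
print(d_max, d_critical, d_max <= d_critical)
```

Estimating p from the same samples that are being tested tends to make the standard critical value conservative, so a result close to Dcritical deserves a closer look.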
An operating characteristic (OC) curve can be used to describe the performance of a sampling plan design given the sample size, n, and a foreign material accept/reject limit, fma. The accept/reject limit is usually equal to (but not required to be equal to) a defined foreign material tolerance, fmt. The seller of the lot may choose to use an accept/reject limit that differs from a tolerance specified by the buyer. Because of the variability among sample test results, there is a certain probability that a lot with a given true PFM will be accepted or rejected (reject = 1.0 − accept) when measuring the foreign material, fm, in a sample. The fm in the sample is compared to the accept/reject limit fma, and the lot is accepted if fm ≤ fma or rejected if fm > fma. A plot of the accept probability versus the true PFM (100p) in the lot is called an OC curve. A generalized OC curve is shown in
The binomial distribution can be used to predict the accept probabilities (OC curve) for a given sample size (n) and accept/reject limit (fma) when sampling a lot with a given true PFM. The effects of sample size, accept/reject limit, and multiple samples on the buyer's risk and seller's risk are shown below using OC curves. All examples are for US medium runner lots (45 kernels per ounce) and a maximum limit specified by the buyer that the lot should not contain more than 1000 foreign material pieces in the entire lot (45,400 lb) of peanuts, or a ratio of 1 FM piece per 45.4 lb. Since the number of kernels in the lot is 32,688,000, the maximum limit (fmt) is equal to (1000/32,688,000) × 100, or 3.0592 × 10⁻³%.
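The lot arithmetic and one point on an OC curve can be sketched as below. This is an illustrative sketch, not the authors' computer model; the function and constant names are hypothetical. The accept probability is taken as the binomial probability that the sample contains no more than the accept/reject limit of FM pieces.

```python
from math import comb

KERNELS_PER_LB = 45 * 16               # 45 kernels per ounce, 16 oz per lb
LOT_LB = 45_400
LOT_KERNELS = LOT_LB * KERNELS_PER_LB  # kernels in the whole lot

# Maximum limit of 1000 FM pieces in the lot, expressed as a percentage (fmt).
FMT_PERCENT = 100.0 * 1000 / LOT_KERNELS


def accept_probability(sample_lb, accept_limit, fm_in_lot):
    """P(sample contains <= accept_limit FM pieces) under the binomial model."""
    n = round(sample_lb * KERNELS_PER_LB)  # kernels in the sample
    p = fm_in_lot / LOT_KERNELS            # true proportion of FM in the lot
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(accept_limit + 1)
    )


print(LOT_KERNELS, FMT_PERCENT)

# OC-curve points for a good lot (500 pieces) and a bad lot (2000 pieces).
for sample_lb, limit in ((45, 1), (90, 2), (180, 4), (360, 8)):
    good = accept_probability(sample_lb, limit, 500)
    bad = accept_probability(sample_lb, limit, 2000)
    print(sample_lb, limit, round(good, 3), round(bad, 3))
```

Evaluating `accept_probability` over a range of `fm_in_lot` values traces the full OC curve for one sampling plan.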
The effect of increasing sample size from 45 to 90 to 180 to 360 pounds and using accept/reject limits of 1, 2, 4, and 8 FM pieces (PFM = 3.0592 × 10⁻³%) is shown in
In the above example showing the effect of sample size (
Reducing the accept/reject limit relative to the maximum limit reduces the proportion of bad lots accepted, or buyer's risk. However, reducing the accept/reject limit relative to the maximum limit also increases the proportion of good lots rejected, or seller's risk. For example, the probability of accepting a lot with 500 FM pieces when using an accept/reject limit of 4, 2, 1, and 0 FM pieces is 95, 70, 40, and 14%, respectively. When using an accept/reject limit lower than the maximum limit specified by the buyer, the seller runs a higher risk of rejecting good lots in order to reduce the risk of accepting bad lots.
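The trade-off can be illustrated with a short calculation. The sample size is not stated in this paragraph, so the 180 lb sample used below is an assumption, chosen because the binomial model then gives accept probabilities close to the percentages quoted above. The 2000-piece bad lot is also a hypothetical choice for illustration.

```python
from math import comb

KERNELS_PER_LB = 45 * 16                 # 45 kernels per ounce
LOT_KERNELS = 45_400 * KERNELS_PER_LB    # 32,688,000 kernels in the lot
SAMPLE_LB = 180                          # ASSUMED sample size (not stated here)


def accept_probability(accept_limit, fm_in_lot, sample_lb=SAMPLE_LB):
    """P(sample contains <= accept_limit FM pieces) under the binomial model."""
    n = sample_lb * KERNELS_PER_LB
    p = fm_in_lot / LOT_KERNELS
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(accept_limit + 1)
    )


for limit in (4, 2, 1, 0):
    sellers_risk = 1.0 - accept_probability(limit, 500)  # good lot rejected
    buyers_risk = accept_probability(limit, 2000)        # bad lot accepted
    print(limit, round(sellers_risk, 3), round(buyers_risk, 3))
```

As the limit drops from 4 to 0, the buyer's risk column shrinks while the seller's risk column grows, which is the pattern the paragraph describes.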
The effect of taking multiple samples from a lot and averaging all sample test results is the same as increasing sample size (
As buyers specify lower maximum limits for total FM pieces in the lot, sample size will have to get larger and/or accept/reject limits much smaller than the maximum limit will have to be used to meet specified risk levels. At some point using samples to determine if a lot meets specifications becomes prohibitive. For example, specifying maximum limits of 1000, 100, and 10 FM pieces in a 45,400 lb lot requires a minimum sample size of 45.4, 454, and 4,540 lb, respectively, with an accept/reject limit of 1 FM piece. At low maximum limits, the handler will require electronic sorters to reliably remove FM from the lots with an efficiency approaching 100%.
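The minimum sample sizes quoted above follow from simple proportion: the sample must be large enough that the accept/reject limit in the sample corresponds to the maximum limit in the lot. A minimal sketch, with a hypothetical function name:

```python
LOT_LB = 45_400  # lot size used throughout the examples


def min_sample_lb(max_fm_in_lot, accept_limit=1):
    """Sample mass at which accept_limit pieces matches the lot's maximum limit."""
    return accept_limit * LOT_LB / max_fm_in_lot


for max_fm in (1000, 100, 10):
    print(max_fm, min_sample_lb(max_fm))
```

Each tenfold reduction in the lot limit requires a tenfold larger sample, which is why sampling alone becomes impractical at very low limits.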
The binomial distribution can also be applied to designing sampling plans for attributes other than FM such as damaged kernels, kernels with biotech traits, and kernels with spots. An interactive program is available on the Internet where the performance of sampling plans can be calculated given the design parameters (
The use of trade names in this publication does not imply endorsement by the USDA or the N.C. Agricultural Research Service of the products named nor criticism of similar ones not mentioned.
USDA, ARS, Box 7625, NC State University, Raleigh, NC 27695
Biological and Agricultural Engineering Department, Box 7625, NC State University, Raleigh, NC 27695-7625
Statistics Department (retired), Box 8203, NC State University, Raleigh, NC 27695-8203