National Research Council Canada
Institute for Information Technology
Conseil national de recherches Canada
Institut de technologie de l’information
If software quality is a perception, how do we measure it?
W.M. Gentleman
Software Engineering Laboratory
July 1996
NRC No. 40149

This paper was originally published in The Quality of Numerical Software: Assessment and Enhancement, Ronald Boisvert, ed., the Proceedings of IFIP WG 2.5 Working Conference 7, Oxford, UK, 7-12 July 1996, Chapman & Hall, London, pp. 32-43.
This paper was also published in Software Quality World-wide: What Are the Practices in a Changing Environment, Proceedings of the Sixth International Conference on Software Quality (6ICSQ), Ottawa, Canada, 28-30 October, 1996, pp. 335-345.
Copyright 1996 by
National Research Council of Canada
Permission is granted to quote short excerpts and to reproduce figures and tables from this report, provided that the source of such material is fully acknowledged.
Additional copies are available free of charge from:
Communications Office
Institute for Information Technology
National Research Council of Canada
Ottawa, Ontario, Canada K1A 0R6

If software quality is a perception, how do we measure it?
W.M. Gentleman,
National Research Council of Canada, Ottawa, Canada, K1A 0R6
Tel. (613) 993-3857, Fax. (613) 952-7151, gentleman@iit.nrc.ca
Abstract
For over twenty years, metrics have been being invented to measure software quality. And yet quantifying quality presupposes agreement on what constitutes quality. Quality has been portrayed as an absolute quantity, subject to objective measurements. We believe this effort has been misguided. We argue that quality, like beauty, is in the eye of the beholder – that is, that quality is not absolute, but depends on the perspective taken by the evaluator. As such, any direct measure of quality must necessarily be subjective, summarizing the impressions of some particular class of people who interact with the product. Indirect measures of quality are less objective than they may appear to be – beyond the arbitrariness of the choice of measure, and any difficulty in its interpretation, there is always the tenuous link of the metric to the perception of quality by any specific group. The need for this novel point of view is especially clearly illustrated by mathematical software.
Keywords
Software quality, measurement, assessment, subjective, objective


INTRODUCTION
At a recent meeting, two eminent computer scientists were discussing the numerical computing environment Matlab. ‘Could Matlab be space-worthy?’ asked the first. ‘Never mind whether you could fly it in a spacecraft’, responded the second, ‘would you even depend on the results of its calculations in a space flight?’ The implied criticism was that Matlab had not been built through the process of formal specification, traceability to requirements, formal reviews, and testing against specification which is typically demanded of safety critical software. Most numerical analysts would disagree with this criticism. As a commercial product, Matlab has been implemented by knowledgeable and skilled experts, based on well studied algorithms. Beyond testing by the developer, and widespread general use, it has been intensely used by, and indeed been the subject of research by, other experts in the area. Weaknesses have been identified, but the consensus seems to be that the base product is solid. Has the conventional process produced similar endorsement from the users of space software? To cite a few incidents: the February 1991 Patriot missile failure illustrates that users may overlook specified limitations; the May 1992 failure of the shuttle Endeavour to recover a satellite automatically illustrates that the standard development process does not protect the unwary against well known anomalies of floating point; and the June 1996 Ariane 5 failure illustrates shortcomings with blindly believing specifications.
In the past, much of the thinking about quality has been in the context of a one-on-one customer/contractor relationship. Galsworthy (1912) extolled the virtues of handcrafted custom products. Today’s customer often faces a very different situation: a choice between competitive off-the-shelf products. (Note that the user of numerical subroutine libraries, or even of pre-existing numerical methods, has always been in this situation.) Each of these products has its own specification, and not only are these different, not only had the individual customer no part in creating that specification, but typically the full specification is not accessible to him. None of the specifications is likely to match any individual customer’s needs exactly, so eliciting detailed requirements from him in the absence of knowledge about the available products is simply an exercise in raising frustration levels. For these reasons, ‘correct implementation’, in the sense that the products conform to their separate specifications, is likely to be a minor issue for the customer. Field experience is likely to be of more value than formal specification and verification.
Quantifying software quality is important because, apart from aesthetic appreciation of quality products, our purpose in examining quality is to facilitate decision making. One example of these decisions is the choice of products, where several would appear to do the job. Another example is the decision of whether to accept, and to pay for, a product that claims to meet a particular need. We are often concerned with quality and price, e.g. what quality is available for a given price, or how much extra would better quality cost. Consequently another example is in deciding what investment is worth making in order to improve the quality of a given product. In many cases what we really want to do is to predict what our own level of satisfaction with the software will be, before we have had the chance to exercise the software extensively in a particular context. This comes up both when the software is unfamiliar to us, and when it is not yet complete. The classical quality assurance motivation (e.g. IEEE, 1988) of monitoring the production process is only part of the story.
An important consequence of quality is that it engenders trust. The user feels ‘If the developer took proper care of the details that I recognize the need for, the prospect is good that he also took proper care of that which I would care about if I had thought of it.’ The details on which quality is assessed and trust is established may not even be ones relevant to this application. ‘For me, I would say software quality occurs when I perceive that the producer appears to have appropriately addressed the issues that confront me as a user. This means mostly an absence of evidence that he or she is ignorant of something that really matters to me but also positive assurances that my concerns are reflected in the system’ (Johnson, 1996). Users must make decisions about word processors and spreadsheets, for instance, and they easily identify quality (or, more significantly, lack of quality) in such products, without ever formalizing needs. In the world of mathematical software, one of the most valuable characteristics to many users is exactly when the software takes care of troublesome but rare situations, so that the user need not even be aware that such situations might exist – and certainly would not be able to enumerate and characterize them.
For numerical software, many people used to think performance (speed, intermediate store, paging behaviour, etc.) was all that mattered for quality, but as widely available machines become more capable, this view is fading as it has for other software (Carrol, 1984).
In common parlance quality is often associated with durability and lasting value. For software, since it does not wear out, this pertains to ongoing need for the software and to resilience of the software to changes in the environment.
The definition of the term quality is an issue. An interesting discussion of the meaning of quality can be found in Kitchenham (1986). A surprising number of people still think software quality is simply the absence of errors. Dictionary definitions are too vague to be of much help. The only relevant definition offered by the Oxford English Dictionary (Oxford, 1993), for instance, is peculiar excellence or superiority. Noteworthy here is that quality cannot be discussed for something in isolation: comparison is intrinsic. Many software engineering references (e.g. Gilbert, 1983; Schach, 1990; Hailstone, 1991; Tinnirello, 1995) define software quality as correct implementation of the specification. Such a definition can be used during product development, but it is inadequate for facilitating comparisons between products. Standards organizations have tended to refer to meeting needs or expectations, e.g. the ISO standard ISO 8402:1986 (ISO, 1986) defines quality as the totality of features and characteristics of a product or service that bears on its ability to satisfy stated or implied needs, adding (Note – In a contractual environment, needs are specified, whereas in other environments, implied needs should be identified and defined.) IEEE Std 610.12–1990 (IEEE, 1990) defines quality as (1) The degree to which a system, component, or process meets specified requirements. (2) The degree to which a system, component, or process meets customer or user needs or expectations. An older IEEE definition, IEEE Std P1061–1988 (IEEE, 1988), defines: Software quality is the degree to which software possesses a desired combination of attributes.
Software quality is often defined in terms of the fitness of the product for its purpose. However different people have different purposes for the same software. A novice casual user is probably more concerned about ease of learning, and about robustness against misuse, than about efficiency. A system integrator, planning to incorporate the software in some larger system, might be more concerned about failure detection and recovery than about ease of initial installation. A third party maintenance organization is concerned with issues such as internal documentation and adequacy of scaffolding (e.g. test harnesses, test generators, and instrumentation) that go beyond issues of direct concern to the users. These show that software quality is not absolute, but is a perception depending upon for whom the quality is evaluated. Moreover, software quality is multifaceted, and the importance of the different facets changes with the context, even for the same person at different points in time.
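The dependence on perspective can be made concrete with a small sketch. This is an illustration only, not something taken from the paper: the attribute names, the ratings, the weights, and the weighted-sum rule are all invented assumptions, and the weighted sum is just one arbitrary way of combining facets. It shows the same set of attribute ratings leading to a different preferred product depending on whose priorities are applied.

```python
# All names and numbers below are hypothetical, chosen only for illustration.

# Attribute ratings (0-10) for two imaginary products.
ratings = {
    "product_a": {"ease_of_learning": 9, "efficiency": 4, "failure_recovery": 5},
    "product_b": {"ease_of_learning": 5, "efficiency": 8, "failure_recovery": 8},
}

# How much each kind of evaluator is assumed to care about each attribute.
weights = {
    "novice_user": {
        "ease_of_learning": 0.7, "efficiency": 0.1, "failure_recovery": 0.2,
    },
    "system_integrator": {
        "ease_of_learning": 0.1, "efficiency": 0.3, "failure_recovery": 0.6,
    },
}


def score(product_ratings, perspective_weights):
    """Weighted sum of attribute ratings under one evaluator's priorities."""
    return sum(
        weight * product_ratings[attribute]
        for attribute, weight in perspective_weights.items()
    )


def preferred(perspective):
    """Return the product with the highest score from the given perspective."""
    return max(ratings, key=lambda name: score(ratings[name], weights[perspective]))


for perspective in weights:
    print(perspective, "prefers", preferred(perspective))
```

With these made-up numbers the novice user prefers product_a and the system integrator prefers product_b, although the ratings themselves never change. Only the weighting, that is, the perspective, differs.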
Consider the purposes of mathematical software products, such as numerical libraries (NAG or IMSL), a numerical computation and visualization environment (MATLAB), a symbolic mathematics system (Maple or Mathematica), or a framework for computation on specific kinds of problems (an ocean model, a bomb code). For any of these products, in addition to the support activities for the product there are people using the product for at the very least three different purposes: 1) production computation of results needed in other disciplines, 2) teaching students about the mathematics, 3) research into developing new mathematical methods. The needs of these groups are often not just different, but conflicting – what one group would regard as quality another may regard as making the product unusable. The optimizations that make the code fast enough, and the intermediate storage compact enough, for production computation to be practical may mean the code is too complex for students to learn from, and that insights possible from intermediate results and auxiliary calculations are not available. The mathematical rigour necessary to prove that new algorithms for symbolic computation have taken all possibilities into account may be so clumsy as to make the system useless for an engineer doing exploratory derivations of formulae that would only be useful to provide insight if the results are simple enough. Flexibility provided by facilitating changes to the source code may be a necessity for users trying to do computations beyond the model that a framework directly provides, but it is a nightmare for support personnel responding to problem reports, who have no easy way of recognizing whether the problem was caused by a defect in the product or by a user change breaking something. (Even restricting user extensibility to plug-in modules still leaves this problem because an inadequate API specification can lead to subtle failures of the plug-in.)
QUALITY ATTRIBUTES
A particularly important distinction is between what represents quality for the user and what represents quality for the supplier of a commercial product. Lists of attributes that quality software must address have been suggested for some time (e.g. Gilb, 1977). A curious aspect of these lists, explicit in the ISO/IEC 9126 standard (ISO/IEC, 1991), is that they typically only consider attributes of direct significance to the user. (This standard points out that there are several potential views of quality, including the user’s view, the developer’s view, and the manager’s view, but only the user’s view is in the current version, with others promised in later revisions. See also (ISO, 1987e).) Such a list, for instance, might be:
User Visible Aspects
• Appropriate functionality
• Coexistence and interoperability
• Ease of use
• Lack of surprises
• Adequate and usable documentation
• Ease of installation and update/cutover, including data.
A list of attributes of importance to a supplier would include these because keeping users satisfied is essential, but would go much further:
Supplier Visible Aspects
• Ease of learning for maintainers
• Ease of adaptability
• Structure for timeliness and cost-effective implementation
• Adequacy of exception handling
• Testability and measurability of product
• Analyzability and predictability of product
• Adequacy of scaffolding and support tools
• Professionalism of programming
• Efficiency and performance
• Ability to convince third parties of correctness, conformance, etc.
MEASUREMENT OF PERCEPTIONS
The direct approach to measuring quality then is to study the perceptions that others have formed of the software, and extrapolate that to our situation. A central issue to address is whose opinions we want. In promoting subjective assessment of software products, we are in no sense suggesting that evaluation of software quality should be left to the intuition of the developers.
One obvious possibility is that of experts in the area. This is, effectively, what the refereeing process of journals provides, although the fact that referees’ reports are not made public diminishes the benefit third parties can gain from them. Most journals would welcome papers that are critiques and comparisons, but these are quite rare. Commercial publishers such as Ovum produce reports like this on popular topics, but they typically have limited accessibility due to price. An advantage of expert assessment is that experts are competent to know what are the potential strengths and weaknesses to look for, and how to study them. The disadvantage is that by knowing too much, the expert may not recognize obstacles that would impede the use of the software by novices, or by users coming from other disciplines.
Another possible source of opinions is journalistic reviewers, such as those published in the computer press. An advantage of this group is that they have professional incentive to do many reviews, and hence have a broad context against which to compare products. They also become adept at explaining their impressions of a product to a largely nontechnical audience. This however can also represent a disadvantage if a reader has a deeper understanding of the area than the reviewer addresses. Another disadvantage is that journalistic reviewers can be biased by their personal needs and experience, which are primarily journalistic not technical.
Yet another possibility, made practical largely by the newsgroups on the Internet, is to assimilate the experiences of ‘users like me.’ Newsgroups such as comp.soft-sys.matlab, comp.soft-sys.math.mathematica, and sci.math.symbolic contain many items that provide insight into users’ impressions of that particular software product: problems users have in appreciating how to use the software, creative ways the software can be used to perform complex tasks, desirable enhancements, etc. Items where a question posed draws responses from one or more other users, or even from supplier’s representatives, are particularly interesting to third parties. Distilling quality evaluations from newsgroups is arduous, however, because of the flood of items, because the information is not presented in a form suitable for automatic processing, and because the contributors do not represent a random sample (and indeed do not explicitly characterize themselves as to background, sophistication etc.). The situation could be improved if, rather than a newsgroup, a WWW web site was used to collect such items in a database. Although HTML is too weak to facilitate mathematical notation directly, diagrams and even mathematical notation can be rendered by adroit use of GIF, which for mathematical software would be of real benefit over just using plain ASCII text. While a vendor’s summary of such user feedback might be dismissed, the raw material being available means others, such as a user group or even an individual potential user, can provide their own analysis. Analysis of such data has much in common with retrospective statistical surveys, and while such studies do not enjoy the opportunities that prospective surveys have to use randomization to eliminate bias, there is a substantial literature on how to detect and ameliorate, if not correct for, its effects (Cochran, 1963; Stephan, 1958; Clark, 1991). Vendors might initially be sensitive to negative image possible from some of the postings, but the newsgroups contain those now, and on the whole are of net benefit to the products.
We are not suggesting that traditional ‘objective’ metrics be abandoned, but only that they should be appreciated in a different light. If the actual intent is to measure perception, measurement, however objective, of attributes of the software that we suspect might influence our perception, is an indirect approach. Relating these measured attributes to measured user satisfaction is surprisingly rare (Buckley, 1995). This indirect approach appears to have the attraction of being more quantitative and precise than the discursive presentation with checklists that typify the direct approach. It also appears to be less influenced by individual intuition and taste. There are, however, deeper considerations that need to be taken into account:
• The set of attributes to consider is problematical.
• The appearance of objectivity is somewhat misleading, in that many of the attributes are in fact qualitative, not quantitative.
• Even notionally quantitative attributes such as portability are not so in practice.
• For some attributes, the arbitrariness of any metric means numerical scores can distort the picture rather than refine it.
• Theorems proved about the software obviously increase our understanding of it, but may be of limited applicability unless the relevant conditions of the theorem can be readily established or the theorem can be shown to be robust.
• Ostensibly reproducible computational experiments can be carried out by the vendor, by an independent testing laboratory, or even by the potential user, to study well defined attributes such as accuracy, storage requirements or speed, yet batteries of tests have similar provisos.
• There is also the question as to whether problems studied by computational experiments should be representative of problems in the area, or should be illustrative of specific strengths (or weaknesses) of the software.
• Experimental assessment of representative tasks, such as extending the algorithm or integrating the software into a larger system, requires control on so many factors that the results are usually best understood as anecdotal.
• Some attributes are intrinsically difficult to observe, which often leads to instead studying surrogates that hopefully exhibit similar behaviour.
• A trap competitive developers often fall into is mistaking a score on the surrogate for the real objective.
• Interestingly, when our real purpose is to predict perception, it may not be necessary for the observed measurement to have a recognized causal relationship with quality at all, provided that there is an observed statistical correlation.
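The last consideration, that an observed statistical correlation may be enough when the goal is to predict perception, can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the measured attribute (defects per thousand lines), the satisfaction ratings, and the choice of the Pearson coefficient as the measure of association are not taken from the paper, and the data are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one 'objective' measurement per product, paired with the
# mean satisfaction rating (1-5) reported by one particular group of users.
defects_per_kloc = [0.5, 1.0, 1.5, 2.0, 2.5]
mean_satisfaction = [4.6, 4.1, 3.9, 3.2, 2.8]

r = pearson(defects_per_kloc, mean_satisfaction)
print(f"correlation between metric and perceived quality: {r:.3f}")
```

A coefficient near -1 or +1 would suggest the metric is a usable predictor of this group's perception, whether or not any causal link is understood. It says nothing about other groups of users, whose ratings would have to be correlated separately.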
An even more indirect approach, currently in vogue, is to study the process by which the
