
Summary of Contents for Voice Master

Page 1: ...TER II and IIe Includes SPEECH RECORDING AND PLAYBACK SPEECH WORD RECOGNITION APPLICATION EXAMPLES ON DISK PROGRAM LIST EXAMPLES WITH VOICE CONTROL OF EXTERNAL SWITCHES WITH AMPLITUDE EDITOR Copyright 1986, 1987 COVOX Inc., 675 Conger Street, Eugene, Oregon 97402. First Printing November 1986. Second Printing August 1987 ...

Page 2: ...BERS 40 7 CALIBRATE AND GAIN CONSIDERATIONS 40 QUICK REFERENCE FOR CABLE CONNECTIONS: The main captive cable from your Voice Master plugs into the joystick port. For Apple II, an optional joystick adapter is needed. The headset has two mini stereo-type jacks on the end of one cable. The red one goes to MIKE, the black one to EAR (if used), both located next to each other on the Voice Master unit. That's it...

Page 3: ...ognition remains an unreliable technology due to uncontrollable variations in the way that normal speech is produced in an uncertain and noisy acoustic environment. Covox Inc. specifically disclaims liability, as stated in the preceding paragraph, when applied to word recognition. PATENTS AND COPYRIGHTS: The software supplied with VOICE MASTER is copyrighted. It may not be copied, reproduced, translated or...

Page 4: ... software related to those described in this manual. Software relating to speech on the Voice Master disk is very extensive. In fact, it is so extensive that we were forced to put music software on the reverse side of the disk. It can be loaded directly from the reverse side with BLOAD, or you can follow instructions on "MENU" from the speech side of the disk. The Voice Master disk contains essential uti...

Page 5: ... is present. It is the responsibility of the user to install the correct software. The word recognition function is independent of Sound Master. Voice Master software utilizes DOS 3.3. There is one playback-only program that can function with ProDOS. Conversion of this particular program to ProDOS form can be accomplished with the conversion r ...

Page 6: ...hat there may be another vocabulary for the same words but in a different language. Note the ampersand "&". Voice Master commands have been "wedged" into Applesoft BASIC, and all such commands begin with this symbol. A pre-recorded vocabulary is loaded into the lower 64K memory bank if the version of Voice Master software that you choose to employ is for a...

Page 7: ...beep. This indicates that a word for that index number was not recorded. The range of indices is 0-63, and playback can be in any order. Now type SPEED 4 SPEAK 5 and you will hear "five" slowed down. The sampling rate during playback has been slowed. The range of SPEED values is 0-10, and 6 is the default value, which exists in the absence of a specific SPEED comman...

Page 8: ...upper 8 bits of the 16-bit address. There are a total of 256 pages of memory in the lower bank of memory (256 x 256 = 65536 bytes) and another 256 pages in the upper bank for Apple IIc and memory-augmented IIe. Memory-augmented versions of Apple II beyond 64K may not perform properly with Voice Master. The command defines the location of a vocabulary when the vocabulary is or ...
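The page arithmetic described above can be sketched in a few lines. This is an illustrative Python model, not Voice Master code: the function names are hypothetical, and only the 256-byte page size and the "upper 8 bits" rule come from the text.

```python
PAGE_SIZE = 256  # bytes per page, as stated in the text


def page_of(address: int) -> int:
    """Return the page number: the upper 8 bits of a 16-bit address."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return address >> 8


def page_start(page: int) -> int:
    """Return the first address of a page (the lower 8 bits are zero)."""
    if not 0 <= page <= 255:
        raise ValueError("page must be in the range 0-255")
    return page * PAGE_SIZE


# 256 pages of 256 bytes cover one 64K bank.
bank_size = 256 * PAGE_SIZE
```

For instance, the address 35072 that appears in the CALL statements of this manual sits exactly at the start of a page, so page_of(35072) and page_start() are inverses for it.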

Page 9: ... PRINT D$;"BLOAD PARTB" 70 CALL 35072 When running a BASIC program, you can stop the program with the CONTROL-C key at any time and change playback characteristics such as SPEED or VOLUME with keyboard commands or equivalent POKEs to memory locations, as discussed in an Appendix. Then type CONT to continue. When playback is in progress, you can press the space bar...

Page 10: ...n proper operation of a voice-operated switch, sometimes referred to as "VOX". A command to record should not normally cause recording to start until a reasonably loud signal is measured. And when the speech sample ends, a short period of low amplitude levels indicates that the recording process should end. An Appendix presents a more detailed explanatio...

Page 11: ...e checked occasionally in case inadvertent jarring, temperature effects, or aging have changed the effective setting. There are two different ways to calibrate: one with a machine language program called "BAR", and another with a wedged-in command, CALIB. One of the options on MENU is CALIBRATION, which selects the wedged command. "BAR" can be l...

Page 12: ... second method for calibrating requires that one of the Voice Master programs with wedges be in main memory. Then use the special wedged-in command. When this command is issued, the question mark in the lower right corner appears as in normal recording. But recording never takes place. Proper calibration has the question mark motionless in...

Page 13: ... made cable can connect the Voice Master to the audio lead that normally goes to the internal speaker. Of course, a separate audio power amplifier or telephone connection can be adapted to suit special needs. RECORDING: With essential Voice Master software installed, have the microphone ready and type. Upon pressing RETURN ...

Page 14: ...he RESET value will apply to the page number in the upper bank of 64K. But this same vocabulary is loaded into the lower 64K bank if Voice Master software is for a 64K system. As described briefly in sections on playback, a recording with the "?" showing in the lower right corner can be put on hold with Control-A 6...

Page 15: ...d Master has no bearing on the nature of the speech initially presented for editing. Editing with Sound Master: The use of the "EDITOR" will be discussed first for the case when the Sound Master is installed. Then the special methods and techniques which can improve speech without the presence of Sound Master can be explained. The pr ...

Page 16: ...nge the 15 bytes of fast data. First is the "B" key. This is a fast way to zero an amplitude. Whereas amplitudes set to zero as previously described can be recovered, the method with the "B" key zeros the fast bytes in a way that cannot be cancelled once you leave the edit mode. Another special edit option is the "X" key. This removes every fourth positiv ...

Page 17: ...ry reducing amplitudes following the end of the voiced sound so as to enhance the sudden amplitude drop. The word might be a little easier to understand. Fricatives such as "f" and "th" also can be improved by reducing amplitudes and/or with "S" and "Z" keys. But you can do more. Try changing "six" to "ticks" by putting a zero amplitude gap just bef...

Page 18: ...AISE AMPLITUDE VALUE; MOVE CURSOR LEFT; MOVE CURSOR RIGHT; LOWER AMPLITUDE VALUE; PLAY TO CURSOR; PLAY ENTIRE WORD; QUIT TO EDITOR MENU; RESTORE AT CURSOR; SILENCE A SIBILANT; REMOVE EVERY 4TH CYCLE; LOW PASS AT CURSOR; SCROLL LEFT; SCROLL RIGHT; LOWER AMPLITUDE M; RAISE AMPLITUDE I. MENU: LEARN A WORD; SPEAK A WORD; EDIT A WORD; CHANGE WORD NUMBER; LOAD A SPEECH FILE; SAVE A SPEECH F...

Page 19: ...he unknown. A closeness score can be computed as the sum of the differences in magnitudes, or root mean square magnitudes. Certain weightings might be applied to the patterns according to relative importance of their various parts. The lowest score then indicates the best estimate for the unknown. A large lowest score indicates no good match. Two or more low scores indicate ...
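The scoring idea in this passage can be illustrated with a small Python sketch. It is only a model of the principle (sum of absolute differences, lowest score wins, a large lowest score means no match). The template layout, the function names and the reject_above threshold are assumptions made for the example and are not taken from the Voice Master software.

```python
def closeness(template, unknown):
    """Sum of the absolute differences between two equal-length patterns."""
    if len(template) != len(unknown):
        raise ValueError("patterns must have the same length")
    return sum(abs(t - u) for t, u in zip(template, unknown))


def best_match(templates, unknown, reject_above):
    """Return (index, scores); index is None when even the best score is too large.

    templates    -- dict mapping a word index to its stored pattern (hypothetical layout)
    reject_above -- hypothetical threshold standing in for "a large lowest score"
    """
    scores = {index: closeness(pattern, unknown) for index, pattern in templates.items()}
    best = min(scores, key=scores.get)
    if scores[best] > reject_above:
        return None, scores
    return best, scores


templates = {0: [1, 2, 3], 1: [5, 5, 5]}
match, scores = best_match(templates, [1, 2, 4], reject_above=3)
```

A weighted variant would multiply each difference by a per-position weight before summing, which is one way to read the remark about relative importance of the various parts.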

Page 20: ...ective when templates outside the sub-group of interest have been TRAIN'ed. What happens when you RECOG? The index number of the best match is put into memory location 25 in page zero. If the best match was, for example, for word index number 3, then the decimal number 3 will appear on the screen with PRINT PEEK(25). What if you get no good match ...

Page 21: ... PEEK(25) IF A > 253 THEN 110 200 BLANK 11 BLANK 12 210 RECOG 1 2 220 UNBLANK 230 A=PEEK(25) 240 IF A 250 THEN 400 In this example, note that a request to repeat the recognition is made if the MIN/MAX error occurs (error numbers 254 and 255 in Loc 25), as shown in the table. The second recognition jumps elsewhere if a time-out occurs. The nature of the number in location 25 can ...

Page 22: ...5 as an error condition, and the word just entered is not averaged. If you are writing an original program, you might want to prompt the user to re-TRAIN, or BLANK and then re-TRAIN. If you re-TRAIN and continue to get beeps, perhaps your original word is at fault and you should start over again. Error Criteria, Thresholds and Hints: Two kinds of error ...

Page 23: ...ous amplitude samples with non-zero values. The ruling number is in location 35088 (nominal value 12). Another parameter determines how long after the end of a word the computer must wait in order to decide when the word has in fact ended. This involves a count of contiguous amplitudes having zero values. The parameter is in location 35089 (nominal value 12). Additional data on memory loca...
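One way to picture the two counts described here is the following Python sketch. It assumes that a word starts after a run of contiguous non-zero samples and ends after a run of contiguous zero samples, both with the nominal value 12 given in the text. How the actual machine-language routine applies these counts is not shown in this excerpt, so the function find_word and its return convention are hypothetical.

```python
def find_word(samples, start_count=12, end_count=12):
    """Locate the first word in a list of amplitude samples.

    Returns (start, end) with end exclusive, or None if no run of
    start_count contiguous non-zero samples is found.
    """
    start = None
    run = 0
    for i, s in enumerate(samples):
        if start is None:
            # Waiting for the word: count contiguous non-zero samples.
            run = run + 1 if s != 0 else 0
            if run >= start_count:
                start = i - start_count + 1
                run = 0
        else:
            # Inside the word: count contiguous zero samples.
            run = run + 1 if s == 0 else 0
            if run >= end_count:
                return start, i - end_count + 1
    if start is None:
        return None
    return start, len(samples)  # input ended before the closing silence


word = find_word([0] * 5 + [3] * 20 + [0] * 15)
click = find_word([0] * 5 + [3] * 4 + [0] * 30)
```

In this model a short click never reaches the starting count and is ignored, while a brief gap inside a word (shorter than the ending count) does not terminate it.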

Page 24: ...ognition. Thus limit the number of repeated TRAIN's to perhaps 2. Some words can benefit with more averaging than others. Be aware of how you release final plosives like "t" and "p". It is often optional in ordinary speech to release such a plosive or not to release it. Consider, for example, the final "t" in ...

Page 25: ...ounds occur. DEMONSTRATION PROGRAMS ON DISK: Note: The vocabularies used with the demonstration programs have been amplitude edited. Some have, in addition, been edited with "Speech Construction Set". The quality is thus likely to be somewhat better than can be realized with directly recorded vocabularies which have not been edited. Put the Voice Master disk in the d ...

Page 26: ...verage, as discussed elsewhere. The "S" key for "SPECTRUM DISPLAY" gets "BAR", as discussed in the section on "CALIBRATION AND MICROPHONE TECHNIQUE". You can experiment with this display to see how patterns change with your speech. Information contained in the bar patterns is used in part for word recognition. The furthest right bar...

Page 27: ... cards. You then say "stand" or "hit me". The idea is to get as close to 21 as you can without going over. Aces can count as one or eleven. After your final "stand", the dealer says either that you won, or he won, or a draw, and your accumulated capital is updated. Say "cards" and see what cards have been played for "card counting" practice. Say car...

Page 28: ...ge 0-255. This program will be even more practical if we set it up to read data from some different program, perhaps one you got from a magazine listing and that you want to check for accuracy by listening to the spoken numbers as you follow along the printed listing with your eyes. We will presume that the program to be checked incl...

Page 29: ...r mix of integers. The checking program ends when a negative final DATA statement appears. This is guaranteed to occur after all other DATA statements and thus provides a positive ending command. The reader will recognize that this program could be written with fewer lines by using colons to put two or more statements on a line. A second and more efficient p...

Page 30: ... be a little slow, especially for characters near the end of the string. We could speed it up somewhat by creating a vocabulary with the more frequently appearing letters of the alphabet in the first part of the string, as done here with "space", which is the most frequently seen character of them all. Another method uses the designate...
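The remark about search speed can be made concrete with a small Python model of a left-to-right scan through a character string, where the position found is used as the word index. The lookup string below is a made-up example (only the leading space reflects the text), and word_index is a hypothetical helper, not part of the program being described.

```python
def word_index(order, ch):
    """Scan 'order' from the left; return (index, comparisons made).

    index is None when the character is not in the string.
    """
    for i, c in enumerate(order):
        if c == ch:
            return i, i + 1
    return None, len(order)


# Hypothetical lookup string: space first, then a few letters.
ORDER = " ETAOINSHR"

first = word_index(ORDER, " ")
last = word_index(ORDER, "R")
```

Characters near the front of the string are found after one or two comparisons, while those at the end need a full pass, which is why putting frequent characters first shortens the average search.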

Page 31: ... to speak the number in Spanish. The following example uses the English "five" to end the program, and various error conditions call for a corrected input. 10 BLOAD PART AX 15 BLOAD PART B X 20 CALL 35072 30 TFIND"NBRS" 40 FIND"SNRS" 50 RECOG 1 60 A=PEEK(25) 70 IF A > 5 THEN 50 80 IF A = 5 THEN 110 90 SPEAK A PAUSE 2 100 GOTO 50 110 END EXTERNAL SENSING AND CONTROL: Apple II and IIe comput...

Page 32: ...e appropriate. We will be concerned with three input/output ports which are available on all Apple II and IIe computers, only some of which also are available on Apple IIc. Specifically, we consider the joystick/paddle port with 9 pins on the connector (requires a special adapter cable for Apple II), as well as the expanded version of t...

Page 33: ...have no control over which one comes first, only that you can alternate between the two. A BASIC program with a loop to create the square wave follows: 10 INPUT N 20 A=PEEK(49200) 30 FOR J=1 TO N: NEXT J 40 GOTO 20 The frequency is set by specifying the number N at the beginning. An on/off switch can be created if the switched device can discriminate between a repeating ...

Page 34: ...hat you remove the cable going to the 9-pin connector, if it exists, or else make a cable that taps into the 16-pin connector while also allowing the cable to the 9-pin connector to be attached. Each switch associates with two memory locations. By referencing the first location with a memory read command (PEEK), the annunciator line is turned off (voltage low). By referencing the second memor...

Page 35: ...s in use have a disk drive which is coupled via a serial port. But many, as in classrooms, do not have a card to operate a printer. A reasonably good printer costs more than the computer it serves. As an aside: The ambitious machine language programmer should be able to create a standard serial output line using any one of the four annunciator bits o...

Page 36: ...ts how a switch can be implemented. Examples with a physical switch show how to make a closure give either a low voltage or a high voltage. [Figure label: Open = high] 10 A=PEEK(49249) 20 IF A > 127 THEN SPEAK 0: REM A IS ON 30 B=PEEK(49250) 40 IF B > 127 THEN SPEAK 1: REM B IS ON 50 C=PEEK(49251) 60 IF C > 127 THEN SPEAK 2: REM C IS ON 70 GOTO 10 This program reports on closed switches. With another set of IF-THEN...
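The polling logic of this listing can be modelled in Python as below. The three addresses are the ones in the listing. The comparison against 127 assumes that a value above 127 (top bit set) means the input is on, which is the usual convention for these Apple II inputs, although the comparison operator is not legible in this excerpt. The read_byte argument is a hypothetical stand-in for PEEK.

```python
SWITCH_ADDRESSES = (49249, 49250, 49251)  # locations read in the listing


def closed_switches(read_byte):
    """Return the indices (0, 1, 2) of inputs whose value exceeds 127.

    read_byte -- hypothetical callable playing the role of PEEK.
    """
    return [
        index
        for index, address in enumerate(SWITCH_ADDRESSES)
        if read_byte(address) > 127  # assumed: top bit set means "on"
    ]


# Simulated memory: first and third inputs on, second off.
fake_memory = {49249: 200, 49250: 60, 49251: 128}
on = closed_switches(fake_memory.__getitem__)
```

Each returned index corresponds to the word number that the listing speaks for that switch.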

Page 37: ...n value is acquired. One can, of course, use the value of PDL(J) to form a voiced message from a stored vocabulary. You could have temperature, wind velocity, wind direction, and humidity all spoken out as measured somewhere else with comparatively simple potentiometers. APPENDICES 1. COMMAND SUMMARY: Recording and Playback: SPEAK n: Designated word or phrase in the ra...

Page 38: ... word n, for n in the range 0-31. BLANK without a number clears all templates. When re-training a word, BLANK first in order to avoid averaging. UNBLANK n: Recovers the template previously BLANK'ed, or all templates if index number n not used. RECOG n: Program waits for an input and attempts recognition by comparing the template made for the input word to those in memory. I...

Page 39: ...back, as well as for disk storage and retrieval, make frequent use of utility programs in the Applesoft ROM. 3. IMPORTANT MEMORY LOCATIONS: Many of the memory locations in this section refer to a "BASE" address, which is defined in the RESET statement used in conjunction with making and saving a vocabulary. The BASE address is stored in memory location 35076. Memory location 25 (19 hex): Curren...

Page 40: ...e the low and high order bytes, respectively, for the ending address for phrase 7. Note: PEEK will not work if you use the 128K version, because speech resides in the upper 64K of memory, which cannot be PEEK'ed or POKE'd from BASIC. Memory Locations BASE+256 and BASE+257: These two memory locations define the current top of speech memory. Memory location BASE+259: Total number of recorded phrases. Rang...
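Addresses stored as a low-order byte followed by a high-order byte, as described here, combine as low + 256 x high. The short Python sketch below shows the conversion in both directions. The function names are hypothetical, and the sample values are chosen only for illustration.

```python
def address_from_bytes(low, high):
    """Combine a low-order and a high-order byte into a 16-bit address."""
    for value in (low, high):
        if not 0 <= value <= 255:
            raise ValueError("each byte must be in the range 0-255")
    return low + 256 * high


def bytes_from_address(address):
    """Split a 16-bit address into (low, high) bytes."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return address & 0xFF, address >> 8


low, high = bytes_from_address(35076)
```

In BASIC terms, the same combination would be the value read at the first location plus 256 times the value read at the next one.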

Page 41: ...LE values are above the default value, but at the cost of additional memory for storage. The beginning byte of a vocabulary consisting of one or more words, up to a total of 64 words, is at BASE+331. The starting address can be displayed (for 64K versions only) as PRINT PEEK(256*n+331), where n is the page number used in RESET def...

Page 42: ...t set to one into a special memory location and then calling the load addresses. For example, assume you want to load a file called "ENGLISH". The following steps will accomplish this: 10 A$="ENGLISH" 20 FOR W=1 TO LEN(A$) 30 POKE 38272+W-1,ASC(MID$(A$,W,1))+128 40 NEXT W 50 POKE 38272+W-1,141: REM REQUIRED ENDING BYTE 60 CALL 38150: REM LOAD FILE In order to play back a...
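The byte values that the listing above places in memory can be worked out with a short Python sketch. It assumes that each character code has 128 added (top bit set) and that the name is closed with the ending byte 141, stored from location 38272 upward, as the listing appears to do. The helper names are hypothetical.

```python
NAME_BUFFER = 38272  # first location used for the file name in the listing
ENDING_BYTE = 141    # required ending byte from the listing


def encode_filename(name):
    """Return the byte values for a file name: code + 128, then the ending byte."""
    data = []
    for ch in name:
        code = ord(ch)
        if code > 127:
            raise ValueError("only 7-bit characters can be encoded")
        data.append(code + 128)  # set the top bit
    data.append(ENDING_BYTE)
    return data


def poke_list(name, base=NAME_BUFFER):
    """Return (address, value) pairs equivalent to the POKE statements."""
    return [(base + offset, value) for offset, value in enumerate(encode_filename(name))]


encoded = encode_filename("ENGLISH")
pokes = poke_list("ENGLISH")
```

For a seven-letter name this yields eight values, the last of which (141) lands one location past the final character.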

Page 43: ...n addition, you must convert your speech file into ProDOS format. The following instructions show you how to load the speech file "ENGLISH", assuming your ProDOS prefix is called /USERS.DISK: 10 A$="/USERS.DISK/ENGLISH" 20 POKE 38080,LEN(A$): REM SET LENGTH OF FILENAME 30 FOR W=1 TO LEN(A$) 40 POKE 38080+W,ASC(MID$(A$,W,1))+128 50 NEXT W 60 CALL 37894: REM LOAD SPEECH FILE The following pr...

Page 44: ... spoken word that will be accepted. For example, Figure 3 shows a short click-like sound that will be rejected if the length of the word between threshold points A and B is less than TI. The purpose of the MIN count is to prevent short bursts of noise from being considered as possible speech candidates. You can change this value with a POKE to loc ...

Page 45: ...on of portions of the word (e.g. the "e" in "equals"), or for excessive noise or silence gaps at the beginning or end of the word. A more accurate means is to use the "EDITOR" program to visually inspect the endpoints. [FIG 1: Effects of Calibration Setting. Labels: Calibrate; AMPLITUDE; Setting correct; Word Length; Word Too Long; AMPLITUDE; ain To...]

Page 46: ...COVOX INC., 675-D Conger Street, Eugene, Oregon 97402, U.S.A. Area Code 503, 342-1271. Telex 706017 (Av Alarm UD) ...
