Device for recognizing speech commands in noisy conditions

FIELD: radio engineering.

SUBSTANCE: device has block for determining beginning and end of command, first memory block, syllable segmentation block, block for time normalization of command, reference commands block and command similarity calculator; output of block for determining beginning and end of command is connected to first inputs of first memory block and syllable segmentation block, and output of first memory block is connected to first input of block for time normalization of command, second input of which is connected to output of syllable segmentation block. Device additionally has reference noise input, second memory block, block for time normalization of noise, first and second level normalization blocks and signal mixer. Speech command input is connected to input of block for determining beginning and end of command and to second inputs of syllable segmentation block and first memory block. Reference noise input is connected to first input of second memory block, second input of which is connected to output of block for determining beginning and end of command. Output of second memory block is connected to first input of block for time normalization of noise. Output of syllable segmentation block is connected to second inputs of block for time normalization of noise, first and second level normalization blocks and reference commands block, and to third inputs of first and second memory blocks. Output of block for time normalization of noise is connected to first input of signal mixer, second input of which is connected to output of reference commands block, first input of which is connected to first output of command similarity calculator. Output of signal mixer is connected to first input of second level normalization block, output of which is connected to second input of command similarity calculator, first input of which is connected to output of first level normalization block, first input of which is connected to output of block for time normalization of command.

EFFECT: higher probability of correct command recognition in noisy conditions.

6 dwg

 

The invention relates to information processing and control systems, in particular to devices for recognizing speech commands in noisy conditions, and can be used to build voice control systems for vehicles operating in acoustic noise, for example in the cab of a motor vehicle, in an aircraft or on board a ship.

Systems for recognizing speech commands in noisy conditions are known (see, for example, U.S. patent No. 6529866 B1, publ. 04.03.2003) that are based on determining the parameters of a hidden Markov model (HMM) of speech commands and that contain, like the proposed device, a speech command input and a block for determining the beginning and end of the command. In addition, the known systems contain blocks for extracting and storing the HMM parameters of speech units and a block for matching the parameters of known speech units against the extracted parameters. The output of the block for determining the beginning and end of the command is connected to the input of the block for extracting the HMM parameters of speech units, whose output is connected to the first input of the matching block; the second input of the matching block is connected to the output of the block storing the HMM parameters of speech units. To reduce the influence of background noise on recognition quality, these systems either place frequency filters at the input, tuned on the non-speech signal, or connect a correction block to the block that decides which known speech unit the signal belongs to. The correction block stores information about the statistical differences between noisy and clean speech fragments (an additional Markov model trained on noisy speech signals). However, uncertainty about the required volume of training samples when building the blocks that hold the HMM parameters leads to an excessive increase in the memory used by the device. A further disadvantage of devices based on HMM parameter extraction blocks is that they work unstably in changing noise with a complex acoustic structure, since the HMMs stored in the device memory may not contain the parameters of all possible noise states.

Also known is a device for recognizing speech commands in noisy conditions according to U.S. patent No. 6678656 B2, publ. 13.01.2004, which contains, like the proposed device, a speech command input and, in addition, high-pass and low-pass filters, a noise filtering block, a block for subtracting the reference noise from the high-frequency component of the signal, and an adder of the noise-filtered low-frequency speech component and the high-frequency speech component. In the known device the high-pass and low-pass filters are connected to the speech command input. The output of the low-pass filter is connected to the input of the noise filtering block, which in turn has two outputs: one is connected to the block for subtracting the reference noise from the high-frequency component of the signal, and the other to the adder of the noise-filtered low-frequency speech component and the high-frequency speech component. The output of the high-pass filter is connected to the input of the reference noise subtraction block, whose output is connected to the input of the adder of the high-frequency and low-frequency components of the signal. Splitting the speech signal with filters makes it possible to clean the high-frequency component of interfering noise using the reference noise filtered out of the low-frequency component. A disadvantage of the known device is the high probability that the noise filter in the low-frequency path suppresses the informative component of the signal. In addition, an error in noise filtering of the low-frequency component leads to an error in subtracting the reference noise from the high-frequency component, which reduces the probability of correct identification of the speech signal by the recognition system as a whole.

The closest to the claimed invention is the speech command recognition device according to U.S. patent No. 4979212, class G 10 L 7/08, publ. 18.12.1990 (the prototype), which contains, like the proposed device, a speech command input, a block for determining the beginning and end of the command, a first memory block, a syllable segmentation block, a command time normalization block, a reference commands block and a command similarity calculator. The output of the block for determining the beginning and end of the command is connected to the first inputs of the first memory block and the syllable segmentation block, and the output of the first memory block is connected to the first input of the command time normalization block, whose second input is connected to the output of the syllable segmentation block.

Unlike the proposed device, the known device also includes a block that converts the input speech signal into its frequency representation and a decision block. The input of the conversion block is connected to the speech command input, and its output to the first input of the block for determining the beginning and end of the command and to the input of the memory block. In addition, the output of the command time normalization block is connected to the first input of the command similarity calculator, to whose second input the reference commands block is connected, and the output of the command similarity calculator is connected to the decision block. The disadvantage of the prototype is a low probability of correct recognition of speech commands in noise with a complex acoustic structure. The reference commands block contains commands either without background noise or with background noise that differs from the background of the command being recognized. In addition, the input conversion block extracts spectral components of the speech signal that are masked by the spectrum of the noise component. This makes the command signals and the references dissimilar and, accordingly, causes errors in the operation of the command similarity calculator.

The need to use speech commands arises, for example, when driving a motor vehicle or operating any other vehicle, in order to reduce the load on the sense organs and means of communication of the driver or pilot. A device that responds by voice to a spoken request about the state of the on-board equipment or the physical characteristics of the vehicle can be used for this purpose. Such devices rely on recognizing speech commands in conditions of intense acoustic noise. The probability of correct command recognition depends to a large extent on the characteristics and intensity of the noise in which the command is spoken. The reason is that recognition is based on the similarity between the characteristics of the spoken command and those of the reference commands. The command is treated as a signal divided into N time segments, and its characteristics are, for example, the values of the normalized signal amplitude taken with some step on each of the N segments. If each segment is described by K characteristics, the similarity measure can be the value

$$\rho = \sum_{i=1}^{N} \sum_{k=1}^{K} \left| x_{ik} - y_{ik} \right| \qquad (1)$$

where $x_{ik}$ and $y_{ik}$ are the values of the k-th characteristic on the i-th of the N segments of the two commands. From (1) it follows that if $x_{ik} = y_{ik}$ for i=1...N, k=1...K, then ρ=0, and the larger the absolute difference of the corresponding values of the characteristics, the greater the value of ρ. The smaller ρ is for two commands, the more similar the commands are. The spoken command is considered identical to the reference for which ρ is minimal. The reference commands used are uttered in the absence of noise, whereas the noise in the command submitted for recognition introduces an error into its characteristics, which increases ρ between the spoken command and the corresponding reference. This reduces the probability of correct recognition. Where the safety of equipment or people depends on the quality of the recognition system, the probability of correctly understanding the command must be maximized, which places high demands on the accuracy of the device for recognizing speech commands in noisy conditions.
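The behaviour of the measure ρ can be illustrated with a short sketch. It assumes that ρ is the plain sum of absolute differences over all segments and characteristics; the function name `similarity` and the numeric values are hypothetical and are not taken from the application.

```python
def similarity(features_a, features_b):
    """Sum of absolute differences over N segments with K characteristics each.

    Both arguments are lists of N segments; each segment is a list of K values.
    A smaller result means the two commands are more alike.
    """
    if len(features_a) != len(features_b):
        raise ValueError("commands must have the same number of segments")
    return sum(
        abs(a - b)
        for seg_a, seg_b in zip(features_a, features_b)
        for a, b in zip(seg_a, seg_b)
    )


# Hypothetical characteristics: N = 3 segments, K = 2 values per segment.
clean = [[0.2, 0.5], [0.9, 0.1], [0.4, 0.4]]
noisy = [[0.3, 0.5], [0.7, 0.2], [0.4, 0.6]]

print(similarity(clean, clean))  # identical commands give zero
print(similarity(clean, noisy))  # noise pushes the value above zero
```

The second call shows the effect described above: noise in the spoken command alone is enough to move ρ away from zero even when the same word is compared with its own clean reference.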

The technical result, an increased probability of correct command recognition in noisy conditions, is achieved as follows. The known speech command recognition device contains a speech command input, a first memory block, a syllable segmentation block, a command time normalization block, a reference commands block, a command similarity calculator and a block for determining the beginning and end of the command, whose output is connected to the first inputs of the syllable segmentation block and the first memory block; the output of the first memory block is connected to the first input of the command time normalization block, whose second input is connected to the output of the syllable segmentation block. Into this device are additionally introduced a reference noise input, a second memory block, a noise time normalization block, first and second level normalization blocks and a signal mixer. The speech command input is connected to the input of the block for determining the beginning and end of the command and to the second inputs of the syllable segmentation block and the first memory block. The reference noise input is connected to the first input of the second memory block, whose second input is connected to the output of the block for determining the beginning and end of the command. The output of the second memory block is connected to the first input of the noise time normalization block. The output of the syllable segmentation block is connected to the second inputs of the noise time normalization block, the first level normalization block, the second level normalization block and the reference commands block, and to the third inputs of the first and second memory blocks. The output of the noise time normalization block is connected to the first input of the signal mixer, whose second input is connected to the output of the reference commands block, the first input of which is connected to the first output of the command similarity calculator.
The output of the signal mixer is connected to the first input of the second level normalization block, whose output is connected to the second input of the command similarity calculator; the first input of the calculator is connected to the output of the first level normalization block, whose first input is connected to the output of the command time normalization block. The second output of the command similarity calculator is the output of the device for recognizing speech commands in noisy conditions.

Introducing the reference noise input and the signal mixer makes it possible to give the reference commands an artificial noise background close to the background of the command being recognized. This brings the recognized command, with its background noise, closer to the corresponding reference (which initially contains no noise background) and so increases the probability of correct command recognition in noisy conditions without additional noise filtering algorithms. The second memory block stores the reference noise, the noise time normalization block brings the duration of the reference noise to the duration of the reference commands, and the first and second level normalization blocks normalize the amplitudes of the commands being compared.

The applicant's analysis of the prior art, including a search of patent and scientific and technical information sources and the identification of sources containing information about analogues of the claimed invention, established that no analogue was found whose features are identical to all the essential features of the claimed invention. Selecting the prototype from the identified analogues, as the analogue closest in its set of essential features, made it possible to identify the set of distinctive features of the claimed device that are essential to the technical result envisaged by the applicant and are set forth in the claims. Therefore, the claimed invention meets the criterion of "novelty".

To check whether the claimed invention meets the criterion of "inventive step", the applicant conducted an additional search of known solutions to identify features that coincide with the features of the claimed device that distinguish it from the prototype. The search showed that the claimed invention does not follow in an obvious way, for a person skilled in the art, from the prior art identified by the applicant. No known influence of the transformations provided for by the essential features of the claimed invention on the achievement of the technical result was found. In particular, the claimed invention does not involve the following transformations: supplementing a known means with any known part attached to it according to known rules in order to achieve a technical result for which the influence of such additions has been established; replacing any part of a known means with another known part in order to achieve a technical result for which the influence of such a replacement has been established; excluding any part of a means with simultaneous exclusion of its function and achievement of the usual result of such exclusion; increasing the number of elements of the same type in order to enhance the technical result due to the presence of such elements in the means; making a known means or part of it from a known material in order to achieve a technical result due to the known properties of that material; creating a means consisting of known parts whose selection and interconnection follow known rules and recommendations, where the technical result achieved is due only to the known properties of the parts of this means and the connections between them; changing quantitative features or the relationships between features, if the influence of each of them on the technical result is known and the new values of the features or their relationships could be obtained from known dependencies. Therefore, the claimed invention meets the criterion of "inventive step".

In a particular case, the first and second level normalization blocks can be implemented as multipliers of the input signal by the reciprocal of the average signal amplitude. The signal mixer can be a signal adder.

The command similarity calculator can be an adder of the absolute differences of the signals arriving in sequence at its first and second inputs. The first and second memory blocks can be RAM. The reference commands block can be ROM. The command time normalization block and the noise time normalization block can be implemented as parallel registers. The block for determining the beginning and end of the command and the syllable segmentation block can be the same as the corresponding blocks of the prototype device.

The invention is illustrated by drawings: figure 1 shows a functional diagram of the device for recognizing speech commands in noisy conditions; figure 2, a flowchart of the algorithm for determining the beginning and end of the command; figure 3, a flowchart of the algorithm for segmenting the command into syllables; figure 4, a flowchart of the algorithm for level normalization of the command; figure 5, a flowchart of the identification algorithm; figure 6, graphs of the probability of correct recognition of speech commands (%) versus the level of the command signal relative to the background noise (dB), where curve 1 corresponds to the claimed device and curve 2 to the prototype.

The device for recognizing speech commands in noisy conditions contains command input 1, reference noise input 2, block 3 for determining the beginning and end of the command, second memory block 4, syllable segmentation block 5, first memory block 6, noise time normalization block 7, reference commands block 8, signal mixer 9, first level normalization block 10, second level normalization block 11, command similarity calculator 12, code output 14 and command time normalization block 15. Command input 1 is connected to the input of block 3 for determining the beginning and end of the command and to the second inputs of syllable segmentation block 5 and memory block 6; the output of block 3 is connected to the first inputs of first memory block 6 and syllable segmentation block 5. The output of first memory block 6 is connected to the first input of command time normalization block 15. The output of command time normalization block 15 is connected to the first input of level normalization block 10, whose output is connected to the first input of command similarity calculator 12. Reference noise input 2 is connected to the first input of second memory block 4, whose second input is connected to the output of block 3 for determining the beginning and end of the command. The output of second memory block 4 is connected to the first input of noise time normalization block 7. The output of syllable segmentation block 5 is connected to the second inputs of command time normalization block 15, noise time normalization block 7, level normalization block 10, level normalization block 11 and reference commands block 8, and to the third inputs of memory blocks 4 and 6.
The output of noise time normalization block 7 is connected to the first input of signal mixer 9, whose second input is connected to the output of reference commands block 8, the first input of which is connected to the first output of command similarity calculator 12. The output of signal mixer 9 is connected to the first input of second level normalization block 11, whose output is connected to the second input of command similarity calculator 12. The second output of command similarity calculator 12 is code output 14 of the device for recognizing speech commands in noisy conditions.

In one embodiment, memory blocks 4 and 6 can be RAM and reference commands block 8 can be ROM. Block 3 for determining the beginning and end of the command, syllable segmentation block 5 and command similarity calculator 12 can be identical to the corresponding blocks of the prototype device. Command time normalization block 15 and noise time normalization block 7 can be implemented as parallel registers. Level normalization blocks 10 and 11 can be implemented as calculators operating according to the flowchart of figure 4. Signal mixer 9 can be an adder.

The device for recognizing speech commands in noisy conditions operates as follows. The signal of a speech command spoken by a person in noisy conditions arrives at command input 1. At the same time, the reference noise signal arrives at reference noise input 2. The reference noise can be obtained, for example, with a microphone located at some distance from the microphone that receives the command. The reference noise can also be noise separated in some way from the mixture of command and noise. Block 3 determines the beginning and end of the command, for example according to the flowchart of figure 2. In step 1, zero initial values are assigned to the sum Su of the absolute values of the signal samples over Km samples, to the number K of the current sample in the window and to the number i of the current signal sample. In step 2, the absolute value of the i-th signal sample is added to Su, after which the number K of the current sample in the window and the number i of the current sample are incremented. If K equals the maximum number of samples in the window, Km, step 3 passes control to the condition of step 4; if not, step 2 is repeated. Step 4 analyzes the average absolute value Su/Km of the signal in the window. If Su/Km exceeds the threshold value S0, then in step 6 the flag F1 is set to one and block 3 forms a pulse that enables writing the command into memory block 6 and the reference noise into memory block 4. This pulse also enables command processing in syllable segmentation block 5. If Su/Km does not exceed the threshold value S0, the next window is analyzed. After the average absolute value Su/Km has exceeded the threshold, steps 8, 9 and 11 analyze the average absolute value of the signal in the window in the same way as steps 2, 3 and 5. If in step 10 Su/Km is less than the threshold value S0, block 3 stops forming the pulse.
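The windowed threshold test of figure 2 can be sketched in a few lines. This is a minimal illustration rather than the applicant's implementation: the function name `find_command_bounds`, the use of non-overlapping windows and the sample values are assumptions made for the example.

```python
def find_command_bounds(samples, window, threshold):
    """Locate a command by the mean absolute amplitude in successive windows.

    samples   -- sequence of signal samples
    window    -- Km, the number of samples per analysis window
    threshold -- S0, the level that the mean absolute value must exceed
    Returns (start, end) sample indices, or None if no window exceeds S0.
    """
    start = None
    for begin in range(0, len(samples) - window + 1, window):
        # Su / Km: mean absolute value of the samples in the current window.
        mean_abs = sum(abs(s) for s in samples[begin:begin + window]) / window
        if start is None:
            if mean_abs > threshold:
                start = begin          # the enabling pulse begins here
        elif mean_abs < threshold:
            return start, begin        # the enabling pulse stops here
    if start is None:
        return None
    return start, len(samples)         # the command ran to the end of the data


# Hypothetical signal: silence, a burst of speech, silence.
signal = [0] * 8 + [5, -5] * 4 + [0] * 8
print(find_command_bounds(signal, window=4, threshold=1))
```

In the device the same decision gates the writing of the command and of the reference noise into their memory blocks, so both recordings cover the same time interval.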

Segmentation of the command into syllables in block 5 can be performed, for example, according to the flowchart of figure 3. Let S be the set of discrete samples of the command. S is divided into N parts of I discrete samples each, and S(i,j) is the i-th sample of the j-th part of S. In step 1 the initial values are set: the number of syllables Ns=1; the number of parts of the signal (of I samples each) in syllable K, SL(K), where K is the number of the syllable, SL(1)=0; and the number of the current part of the signal j=1. In step 2 the average level Pm of the whole command is determined.

In step 3 the average level P(j) of the current part of the signal is determined, and the value P(0) is assigned the value P(1). This is necessary so that P(0) and P(1) stand in the same relation to Pm; otherwise, on the first pass through steps 4, 5 and 7, an undefined value of P(j-1)=P(0) (the initial value is j=1) could falsely increase the number of syllables Ns. In step 4 the values P(j) and P(j-1) are compared with Pm. If P(j) and P(j-1) are both not less than Pm, then in step 5 the number of parts SL(Ns) in the current syllable Ns is incremented and the next part is taken for analysis, after which control passes to step 8. If the condition of step 4 is not satisfied, control passes to the condition of step 6. If P(j) and P(j-1) are both less than Pm, i.e. the condition of step 6 is satisfied, the actions of step 5 are performed; if not, in step 7 the number of syllables Ns is incremented, the number of parts in the current syllable SL(Ns) is set to one, and the current part number is advanced to the next. Step 8 checks whether the current part of the signal is the last. If not, the steps are repeated starting from step 3; if so, the algorithm stops. As a result of segmentation, the number of syllables Ns in the command is determined, as well as the duration L(K) of each syllable (K=1...Ns).

The duration of a syllable is L(K) = SL(K)·I/f0, where f0 is the sampling rate of the command. Then, for the duration L(K) of each syllable (K=1...Ns), block 5 generates pulses whose frequency is set by L0, the normalized duration of the syllable, and by a reduction coefficient for the frequency f0 of approximately 0.2-0.3.
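The segmentation procedure of figure 3 can be sketched as follows. The sketch rests on stated assumptions: the "average level" is taken to be the mean absolute amplitude, trailing samples that do not fill a whole part are ignored, and the name `segment_syllables` and the sample values are hypothetical.

```python
def segment_syllables(samples, part_len):
    """Return SL: the number of parts (of part_len samples) in each syllable.

    A new syllable starts whenever the level of a part moves to the other
    side of the overall average level Pm relative to the previous part.
    """
    n_parts = len(samples) // part_len
    if n_parts == 0:
        return []
    # P(j): average level of each part (assumed: mean absolute amplitude).
    levels = [
        sum(abs(s) for s in samples[j * part_len:(j + 1) * part_len]) / part_len
        for j in range(n_parts)
    ]
    # Pm: average level of the whole command.
    overall = sum(levels) / n_parts
    parts_per_syllable = [0]          # step 1: Ns = 1, SL(1) = 0
    previous = levels[0]              # P(0) is set equal to P(1)
    for level in levels:
        # Steps 4 and 6: are P(j) and P(j-1) on the same side of Pm?
        if (level >= overall) == (previous >= overall):
            parts_per_syllable[-1] += 1     # step 5: extend current syllable
        else:
            parts_per_syllable.append(1)    # step 7: start a new syllable
        previous = level
    return parts_per_syllable


# Hypothetical command: two loud parts, three quiet parts, one loud part.
command = [4, -4, 4, -4, 0, 0, 0, 0, 0, 0, 4, 4]
print(segment_syllables(command, part_len=2))
```

Note that under this reading a quiet stretch between two loud stretches is counted as a segment of its own, because the procedure starts a new segment on every crossing of Pm in either direction.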

When the first pulse arrives from block 5, samples of the command begin to appear at the output of memory block 6 at the frequency f0, and samples of the noise at the output of memory block 4. In addition, on receipt of each pulse from block 5, command time normalization block 15 selects from the sequence of samples coming from memory block 6 the one sample that immediately follows the pulse, noise time normalization block 7 likewise selects a sample from those coming from memory block 4, and reference commands block 8 outputs the next sample of the reference command with which the command being recognized is currently compared. Each pulse from block 5 also enables the recording of samples arriving at the inputs of level normalization blocks 10 and 11. Blocks 7 and 15 can operate, for example, as a parallel register.
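The effect of pulse-driven sample selection can be sketched as picking a fixed number of evenly spaced samples from every syllable, so that long and short syllables end up with the same length. This is only one reading of the passage: the exact pulse frequency is not reproduced here, and the name `normalize_time`, the parameter `samples_per_syllable` and the data are assumptions.

```python
def normalize_time(samples, syllable_lengths, samples_per_syllable):
    """Keep samples_per_syllable evenly spaced samples from each syllable.

    samples              -- the stored command (or reference noise) samples
    syllable_lengths     -- length of each syllable, in samples
    samples_per_syllable -- assumed fixed output length per syllable
    """
    selected = []
    start = 0
    for length in syllable_lengths:
        if length > 0:
            for m in range(samples_per_syllable):
                # Each "pulse" selects the sample at an evenly spaced offset.
                offset = (m * length) // samples_per_syllable
                selected.append(samples[start + offset])
        start += length
    return selected


# Hypothetical data: a 4-sample syllable followed by a 6-sample syllable.
command = list(range(10))
noise = [10 * v for v in range(10)]
lengths = [4, 6]

print(normalize_time(command, lengths, 2))
print(normalize_time(noise, lengths, 2))
```

Applying the same syllable lengths to the command and to the reference noise mirrors the device, where blocks 15 and 7 are driven by the same pulses from block 5 and therefore stay aligned in time.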

In signal mixer 9 the samples of the reference noise coming from block 7 are added to the samples of the reference command coming from reference commands block 8. The reference command mixed with the reference noise enters level normalization block 11, and the command being recognized enters level normalization block 10. Level normalization of a command is performed, for example, according to the flowchart of figure 4. In step 1 the average level Su of the command is determined and the number of the current sample is given the initial value i=0; then, in step 2, the current sample of the command is divided by the average level and the number of the current sample is incremented (i=i+1). If the end-of-command condition in step 3 is not satisfied, i.e. if i is not equal to N(S), step 2 is repeated. As a result of applying the algorithm of figure 4, the command is normalized in level.
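Mixing and level normalization can be sketched together. The sketch assumes that the mixer is a sample-by-sample adder and that the average level is the mean absolute amplitude, in line with the particular embodiments mentioned earlier; the function names and the values are hypothetical.

```python
def mix(reference_command, reference_noise):
    """Signal mixer as an adder: add noise samples to reference samples."""
    return [c + n for c, n in zip(reference_command, reference_noise)]


def normalize_level(samples):
    """Divide every sample by the average level of the whole sequence.

    The average level is assumed to be the mean absolute amplitude.
    A silent or empty sequence is returned unchanged.
    """
    if not samples:
        return []
    average = sum(abs(s) for s in samples) / len(samples)
    if average == 0:
        return list(samples)
    return [s / average for s in samples]


# Hypothetical time-normalized reference command and reference noise.
reference = [1.0, 2.0, -1.0, -2.0]
noise = [0.5, -0.5, 0.5, -0.5]

noisy_reference = normalize_level(mix(reference, noise))
print(noisy_reference)
```

After normalization the mean absolute amplitude of each sequence equals one, so a loud and a quiet utterance of the same command can be compared sample by sample.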

Next, the samples of the command being recognized and of the reference command arrive at command similarity calculator 12. Block 12 determines the similarity (1) between the command being recognized and the reference command (the distance between the commands), for example according to the flowchart of figure 5. In step 1, the minimum distance Rm between commands is assigned a deliberately large value Rm=Rmax, the number of the current reference command is set to zero, C=0, and the number of the reference command that gave the minimum distance is assigned a non-existent command number, for example Cm=-1. Step 2 checks whether the lengths of the commands are equal. If the lengths are equal, then in step 3 the distance R between the commands is computed as the sum of the absolute differences of the command samples.

If the lengths of the commands are not equal, then in step 4 the distance is assigned a deliberately large value R=Rmax. In step 5 the obtained distance is compared with the current minimum distance Rm and with the threshold R0. If the distance R is less than both Rm and R0, then in step 6 the minimum distance is assigned the value of the obtained distance, the number Cm of the reference command that gave the minimum distance is assigned the number of the current reference, and control passes to the next reference. If the condition of step 5 is not satisfied, then in step 7 control passes to the next reference. Step 8 determines whether the current reference command is the last; if not, then in step 9 block 12 generates a pulse that sets the next reference in block 8, and the command being recognized is then compared with the next reference command (from step 2). If the current reference command is the last, the number Cm of the reference command that gave the minimum distance is delivered to the output of block 12. When the pulse from the first output of calculator 12 arrives at the input of block 8, the samples of the next reference command are delivered to the output of that block. Blocks 9, 11 and 12 then perform their functions as described above. After the command being recognized has been compared with all the references, code output 14 carries either the code corresponding to -1 (if the distances to all the reference commands exceed R0) or the code corresponding to the number of one of the reference commands stored in reference commands block 8.
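The identification loop of figure 5 can be sketched as a search for the nearest reference under a rejection threshold. The sketch assumes that the references passed in have already been mixed with the reference noise and normalized in level; the name `identify`, the use of infinity for Rmax and the data are assumptions.

```python
def identify(command, references, threshold, r_max=float("inf")):
    """Return the index of the closest reference command, or -1 if none fits.

    command    -- normalized samples of the command being recognized
    references -- list of normalized, noise-mixed reference commands
    threshold  -- R0, the largest distance still accepted as a match
    r_max      -- Rmax, the deliberately large distance for length mismatch
    """
    best_distance = r_max     # Rm
    best_index = -1           # Cm: a non-existent command number
    for index, reference in enumerate(references):
        if len(reference) == len(command):
            # Step 3: sum of absolute differences of the samples.
            distance = sum(abs(c - r) for c, r in zip(command, reference))
        else:
            # Step 4: commands of different length cannot match.
            distance = r_max
        # Steps 5 and 6: keep the reference only if it beats both Rm and R0.
        if distance < best_distance and distance < threshold:
            best_distance = distance
            best_index = index
    return best_index


# Hypothetical normalized sequences.
spoken = [1.0, 0.0, 1.0]
stored = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.5], [1.0, 0.0]]

print(identify(spoken, stored, threshold=1.0))
print(identify(spoken, stored, threshold=0.25))
```

The threshold R0 is what lets the device answer "no command recognized": with a strict threshold even the nearest reference is rejected and the output code stays at -1.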

The above information shows that the following set of conditions is satisfied when the claimed device is used:

- the means embodying the claimed device is intended for building voice control systems for vehicles operating in acoustic noise;

- for the claimed device, as characterized in the claims, the possibility of its implementation by the means and methods described in the application or known before the priority date is confirmed;

- the means embodying the claimed invention is capable of achieving the technical result envisaged by the applicant.

Thus, the claimed invention meets the criterion of "industrial applicability".

The device for recognizing speech commands in noisy conditions was tested in the laboratories of Ulyanovsk State Technical University, for which purpose it was modelled on a PC in the Delphi environment. The experimental equipment was an IBM-type computer with a 600 MHz processor, an SB Live sound card and a dynamic microphone. Aircraft engine noise recorded in real conditions was used both as the noise background of the speech commands and as the reference noise. The tests were carried out on a dictionary of reference commands consisting of 23 words of aviation terminology. To estimate the recognition probability, each command from the dictionary was spoken three times, and each of the 23×3 recognized commands was then compared with the 23 references. The commands were spoken by a single speaker (male voice). The characteristics shown by the tested device in different noise conditions are given in figure 6, which also shows the characteristics of the prototype. Thus, testing confirmed the achievement of the technical result, an increase in the probability of correct command recognition in noisy conditions: for example, at a signal-to-noise ratio of 3 dB the probability of correct command recognition for the claimed device is 92%, which is 15 percentage points higher than the 77% of the prototype.

A device for recognizing speech commands in noise, comprising a speech command input, a block for determining the beginning and end of a command, a first memory block, a syllable segmentation block, a command time normalization block, a reference commands block and a command similarity calculator, wherein the output of the block for determining the beginning and end of the command is connected to the first inputs of the first memory block and of the syllable segmentation block, and the output of the first memory block is connected to the first input of the command time normalization block, the second input of which is connected to the output of the syllable segmentation block, characterized in that it additionally comprises a reference noise input, a second memory block, a noise time normalization block, first and second level normalization blocks and a signal mixer, wherein the speech command input is connected to the input of the block for determining the beginning and end of the command and to the second inputs of the syllable segmentation block and of the first memory block; the reference noise input is connected to the first input of the second memory block, the second input of which is connected to the output of the block for determining the beginning and end of the command; the output of the second memory block is connected to the first input of the noise time normalization block; the output of the syllable segmentation block is connected to the second inputs of the noise time normalization block, the first level normalization block, the second level normalization block and the reference commands block, and to the third inputs of the first and second memory blocks; the output of the noise time normalization block is connected to the first input of the signal mixer, the second input of which is connected to the output of the reference commands block, the first input of which is connected to the first output of the command similarity calculator; the output of the signal mixer is connected to the first input of the second level normalization block, the output of which is connected to the second input of the command similarity calculator, the first input of which is connected to the output of the first level normalization block, the first input of which is connected to the output of the command time normalization block; and the second output of the command similarity calculator is the output of the device for recognizing speech commands in noise.
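
The signal path in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim names the blocks but not their algorithms, so linear-interpolation time normalization, unit-RMS level normalization, additive mixing and normalized correlation as the similarity measure are all assumptions, and the function names (time_normalize, level_normalize, similarity, recognize) are hypothetical. The sketch also processes the whole command at once, whereas the claimed device is driven by the syllable segmentation block.

```python
import math

def time_normalize(signal, length):
    """Resample a sequence to `length` points by linear interpolation (assumed method)."""
    if length == 1 or len(signal) == 1:
        return [signal[0]] * length
    scale = (len(signal) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out

def level_normalize(signal):
    """Scale to unit RMS (assumed method); an all-zero input is returned unchanged."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return list(signal) if rms == 0 else [x / rms for x in signal]

def similarity(a, b):
    """Normalized correlation, used here as an assumed similarity measure."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return 0.0 if den == 0 else num / den

def recognize(command, noise, references, length=32):
    """Compare the noisy command with every reference command mixed with the stored noise."""
    cmd = level_normalize(time_normalize(command, length))
    nz = time_normalize(noise, length)
    best_name, best_score = None, -math.inf
    for name, ref in references.items():
        # Signal mixer: reference command plus time-normalized reference noise.
        mixed = [r + n for r, n in zip(time_normalize(ref, length), nz)]
        score = similarity(cmd, level_normalize(mixed))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

The point of the arrangement is that the references are corrupted with the same noise as the input before comparison, so both sides of the similarity calculation are degraded in the same way.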



 

Same patents:

The invention relates to the transmission of speech

The invention relates to communication systems and is used to perform variable-rate code-excited linear prediction coding

FIELD: radio engineering.

SUBSTANCE: device has block for determining beginning and end of command, first memory block, block for syllable segmentation, block for time normalization of command, reference commands block, command similarity calculator, while output of block for determining beginning and end of command is connected to first inputs of first memory block and syllable segmentation block, output of first memory block is connected to first input of command time normalization block, second input of which is connected to output of syllable segmentation block. Device additionally has reference noise input, second memory block, block for time normalization of noise, first and second blocks for level normalization, signal mixer, while input of speech command is connected to input of block for determining beginning and end of command and to second inputs of syllable segmentation block and first memory block, reference noise input is connected to first input of second memory block, to second input of which output of block for determining beginning and end of command is connected, output of second memory block is connected to first input of block for time normalization of noise, output of syllable segmentation block is connected to second inputs of block for time normalization of noise, of first and second level normalization blocks, reference commands block, and to third inputs of first and second memory blocks, output of block for time normalization of noise is connected to first input of signal mixer, to second input of which output of reference commands block is connected, first input of which is connected to first output of command similarity calculator, output of signal mixer is connected to first input of second level normalization block, output of which is connected to second input of command similarity calculator, to first input of which output of first level normalization block is connected, first input of which is connected to output of block for time normalization of command.

EFFECT: higher probability of correct command recognition in the presence of noise.

6 dwg

FIELD: technology for analyzing speech under unfavorable environmental conditions.

SUBSTANCE: during conversion of a spoken command, a first circular buffer is continuously filled with the digitized signal, a bank of recursive filters is applied to the repeatedly decimated signal and the resulting spectral components are used to fill a second circular buffer, the boundaries of the speech fragment are determined within it on the basis of an adaptive estimate of the noise environment, the spectral components of the speech fragment are transferred to a linear analysis buffer, a reduced feature space is obtained from that buffer, and the resulting spectral components are compared with the reference vectors of the command database.

EFFECT: stable recognition of commands regardless of the particularities of the speaker's pronunciation when the device is used in conditions with a high noise level, for example in a moving vehicle or in the mechanical industry; decreased memory volume.

7 cl, 2 dwg
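
The circular-buffer and boundary-detection steps in the abstract above can be sketched as follows. This is a rough illustration only: the abstract does not give the detection rule, so the energy threshold relative to a running noise floor, the exponential update of that floor, and the parameters ratio and alpha are assumptions, and find_speech_bounds is a hypothetical name. A collections.deque with maxlen stands in for the circular buffer of spectral components, reduced here to one energy value per frame.

```python
from collections import deque

def find_speech_bounds(frames, ratio=4.0, alpha=0.1):
    """Return (first, last) buffer positions of the speech fragment, or None.

    `frames` is an iterable of per-frame energies, oldest first.
    The first frame is assumed to contain noise only.
    """
    noise_floor = None
    start = end = None
    for position, energy in enumerate(frames):
        if noise_floor is None:
            noise_floor = energy
            continue
        if energy > ratio * noise_floor:
            # Frame is well above the noise estimate: treat it as speech.
            if start is None:
                start = position
            end = position
        else:
            # Adapt the noise estimate only on frames judged to be noise.
            noise_floor = (1 - alpha) * noise_floor + alpha * energy
    return None if start is None else (start, end)

# Circular buffer: once full, each new frame pushes out the oldest one.
ring = deque(maxlen=8)
for frame_energy in [1.0, 1.1, 0.9, 1.0, 9.0, 12.0, 10.0, 1.0, 1.1]:
    ring.append(frame_energy)
bounds = find_speech_bounds(ring)
```

Here the buffer holds the last eight frames, and bounds gives the positions of the loud fragment inside it.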

FIELD: physics.

SUBSTANCE: invention relates to noise estimation, particularly to estimation of noise in signals used for pattern recognition. The method and device estimate additive noise in a noisy signal using incremental Bayesian learning. A prior distribution of the time-varying noise is assumed, and the hyperparameters (mean and variance) are updated recursively using an approximation of the noise posterior calculated at the previous step. The additive noise in the time domain is represented in the log-spectrum or cepstrum domain before the incremental Bayesian learning is applied. The resulting estimates of the noise mean and variance for each separate frame are used for enhancement of speech signals in the same log-spectrum or cepstrum domain.

EFFECT: more efficient estimation of noise in signals used for pattern recognition.

20 cl, 4 dwg
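
The idea of updating the noise mean and variance recursively, frame by frame, can be illustrated with a much simpler stand-in. The sketch below uses an exponentially weighted recursion with a forgetting factor rho; it is not the approximation of the posterior described in the abstract above, and update_noise_stats and rho are hypothetical names chosen for the example. Each value is assumed to be one log-spectral or cepstral coefficient of a noise frame.

```python
def update_noise_stats(mean, variance, frame_value, rho=0.1):
    """One recursive step: fold a new frame into the running mean and variance.

    Exponentially weighted recursion (a simplified stand-in, not the patent's method):
        mean_t = (1 - rho) * mean_{t-1} + rho * x_t
        var_t  = (1 - rho) * (var_{t-1} + rho * (x_t - mean_{t-1})**2)
    """
    deviation = frame_value - mean
    new_mean = mean + rho * deviation
    new_variance = (1 - rho) * (variance + rho * deviation * deviation)
    return new_mean, new_variance

# Start from a prior guess and refine it as frames arrive.
mean, variance = 0.0, 1.0
for value in [2.0] * 200:
    mean, variance = update_noise_stats(mean, variance, value)
```

With a constant input the mean converges to that value and the variance shrinks toward zero, which is the behaviour expected from any recursive estimator of this kind.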

FIELD: information technology.

SUBSTANCE: an alternative sensor signal is generated, wherein the alternative sensor is less sensitive to ambient noise than an air-conduction microphone. An air-conduction microphone signal is generated. The alternative sensor signal and the air-conduction microphone signal are used to estimate the likelihood L(St) of the speech state St by estimating a separate likelihood component for each of a set of frequency components and merging the separate likelihood components to form an estimate of the likelihood of the speech state. The likelihood of the speech state is used to estimate a noise-reduced value, which models the noise-reduced value for the given speech state. The likelihood of the speech state is used together with the alternative sensor signal and the air-conduction microphone signal to estimate a clean speech value for the clean speech signal.

EFFECT: generation of a high-quality speech signal.

13 cl, 6 dwg
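
The step of estimating one likelihood component per frequency component and merging them can be sketched as below. The abstract above does not say how the components are modelled or merged, so the Gaussian model per frequency component and the merge by summing log-likelihoods (which treats the components as independent) are assumptions, and the function names are hypothetical.

```python
import math

def gaussian_log_likelihood(x, mean, variance):
    """Log-density of x under a Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * variance) + (x - mean) ** 2 / variance)

def speech_state_log_likelihood(observations, means, variances):
    """Merge per-frequency components by summing their log-likelihoods."""
    return sum(
        gaussian_log_likelihood(x, m, v)
        for x, m, v in zip(observations, means, variances)
    )

# Two frequency components, scored under a "speech" and a "non-speech" state model.
observed = [5.0, 4.0]
ll_speech = speech_state_log_likelihood(observed, [5.0, 4.0], [1.0, 1.0])
ll_silence = speech_state_log_likelihood(observed, [0.0, 0.0], [1.0, 1.0])
```

Comparing the merged values for the candidate states indicates which state better explains the observed frame.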

FIELD: physics.

SUBSTANCE: method is performed by analysing metadata to determine whether or not the metadata are or include profile metadata indicating a target profile, wherein the profile metadata are suitable for performing at least one of volume control, loudness normalization, or dynamic range control of audio data in accordance with the target profile. The target profile determines the target loudness and/or at least one target dynamic range characteristic of a rendered version of the audio data for playback by an audio playback device from a group of audio playback devices.

EFFECT: ensuring the reception of bit streams.

19 cl, 17 dwg
