首页资源分类DSP > XC166Lib_Keil_userManual

XC166Lib_Keil_userManual

已有 445466个资源

下载专区

文档信息举报收藏

标    签:bldc

分    享:

文档简介

XC166Lib_Keil_userManual

文档预览

User’s Manual for Keil Compiler, V 1.2, November 2003 XC166Lib A DSP Library for XC16x Family Microcontrollers Never stop thinking. Edition 2003-11 Published by Infineon Technologies AG 81726 München, Germany © Infineon Technologies AG 2006. All Rights Reserved. LEGAL DISCLAIMER THE INFORMATION GIVEN IN THIS APPLICATION NOTE IS GIVEN AS A HINT FOR THE IMPLEMENTATION OF THE INFINEON TECHNOLOGIES COMPONENT ONLY AND SHALL NOT BE REGARDED AS ANY DESCRIPTION OR WARRANTY OF A CERTAIN FUNCTIONALITY, CONDITION OR QUALITY OF THE INFINEON TECHNOLOGIES COMPONENT. THE RECIPIENT OF THIS APPLICATION NOTE MUST VERIFY ANY FUNCTION DESCRIBED HEREIN IN THE REAL APPLICATION. INFINEON TECHNOLOGIES HEREBY DISCLAIMS ANY AND ALL WARRANTIES AND LIABILITIES OF ANY KIND (INCLUDING WITHOUT LIMITATION WARRANTIES OF NON-INFRINGEMENT OF INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY) WITH RESPECT TO ANY AND ALL INFORMATION GIVEN IN THIS APPLICATION NOTE. Information For further information on technology, delivery terms and conditions and prices please contact your nearest Infineon Technologies Office (www.infineon.com). Warnings Due to technical requirements components may contain dangerous substances. For information on the types in question please contact your nearest Infineon Technologies Office. Infineon Technologies Components may only be used in life-support devices or systems with the express written approval of Infineon Technologies, if a failure of such components can reasonably be expected to cause the failure of that life-support device or system, or to affect the safety or effectiveness of that device or system. Life support devices or systems are intended to be implanted in the human body, or to support and/or maintain and sustain and/or protect human life. If they fail, it is reasonable to assume that the health of the user or other persons may be endangered. User’s Manual for Keil Compiler, V 1.2, November 2003 XC166Lib A DSP Library for XC16x Family Guangyu Wang AI MC MA TM (Munich, Germany) Microcontrollers Never stop thinking. XC166Lib Revision History:2003-11V 1.2 Previous Version: 2002-02 V 1.0, 2002-09 V1.1 Page Subjects (major changes since last revision) We Listen to Your Comments Any information within this document that you feel is wrong, unclear or missing at all? Your feedback will help us to continuously improve the quality of this document. Please send your proposal (including a reference to this document) to: C166Lib-support@infineon.com DSP Library for XC16x Family Table of Contents Page 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 1.1 Introduction to XC166Lib, a DSP Library for XC16x . . . . . . . . . . . . . . . . . 14 1.2 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 1.3 Future of the XC166Lib . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 1.4 Support Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2 Installation and Build . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.1 XC166Lib Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.2 Installing XC166Lib . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.3 Building XC166Lib . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.4 Source Files List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3 3.1 3.2 3.3 3.4 3.4.1 3.4.2 3.4.3 3.4.4 3.4.5 DSP Library Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 XC166Lib Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Calling a DSP Library Function from C Code . . . . . . . . . . . . . . . . . . . . . . 19 XC166Lib Example Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 XC166Lib Implementation - A Technical Note for Keil Compiler . . . . . . . . 19 Memory Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Memory Models for Keil C Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 Optimization Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 Cycle Count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Testing Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 4 4.1 4.1.1 4.2 4.2.1 4.2.2 4.2.3 4.2.4 4.2.5 4.2.6 4.3 4.3.1 4.3.2 4.3.3 4.3.4 4.4 4.4.1 4.4.2 4.4.3 4.4.4 4.4.5 Function Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 Argument Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 Arithmetic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Complex Number Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Complex Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Complex Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 Complex Number Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 FIR Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 Transversal Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 Lattice Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 Multirate FIR Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 IIR Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Direct Form 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Direct Form 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Cascaded Biquad IIR Filter with Direct Form 2 . . . . . . . . . . . . . . . . . . . 75 Cascaded Biquad IIR Filter with Direct Form 1 . . . . . . . . . . . . . . . . . . . 75 Lattice IIR Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 User’s Manual for Keil Compiler I-1 V 1.2, 2003-11 DSP Library for XC16x Family 4.4.6 4.5 4.5.1 4.5.2 4.6 4.6.1 4.6.2 4.6.3 4.6.4 4.6.5 4.7 4.7.1 4.7.2 4.7.3 4.7.4 4.8 4.8.1 4.9 4.10 4.10.1 4.10.2 4.10.3 5 Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 Adaptive Digital Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 Delayed LMS algorithm for an adaptive filter . . . . . . . . . . . . . . . . . . . 112 Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 Fast Fourier Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 Radix-2 Decimation-In-Time FFT Algorithm . . . . . . . . . . . . . . . . . . . . 126 Radix-2 Decimation-In-Frequency FFT Algorithm . . . . . . . . . . . . . . . . 128 Complex FFT Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 Calculation of Real Forward FFT from Complex FFT . . . . . . . . . . . . . 133 Calculation of Real Inverse FFT from Complex FFT . . . . . . . . . . . . . . 134 C166S V2 Core Implementation Note . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 Organization of FFT functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 Implementation of Real Forward FFT . . . . . . . . . . . . . . . . . . . . . . . . . 136 Implementation of Real Inverse FFT . . . . . . . . . . . . . . . . . . . . . . . . . . 138 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 Mathematical Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 Statistical Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 Implementation Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 User’s Manual for Keil Compiler -2 V 1.2, 2003-11 DSP Library for XC16x Family List of Figures Page Figure 4-1 Figure 4-2 Figure 4-3 Figure 4-4 Figure 4-5 Figure 4-6 Figure 4-7 Figure 4-8 Figure 4-9 Figure 4-10 Figure 4-11 Figure 4-12 Figure 4-13 Figure 4-14 Figure 4-15 Figure 4-16 Figure 4-17 Figure 4-18 Figure 4-19 Figure 4-20 Figure 4-21 Figure 4-22 Figure 4-23 Figure 4-24 Figure 4-25 Figure 4-26 Figure 4-27 Figure 4-28 Figure 4-29 Figure 4-30 Figure 4-31 Figure 4-32 Figure 4-33 Figure 4-34 Figure 4-35 Figure 4-36 Figure 4-37 Figure 4-38 Figure 4-39 Figure 4-40 Figure 4-41 Figure 4-42 Figure 4-43 The Complex Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 16-bit Complex number representation . . . . . . . . . . . . . . . . . . . . . . . . 29 Complex Number addition for 16 bits. . . . . . . . . . . . . . . . . . . . . . . . . . 31 Complex number subtraction for 16 bits . . . . . . . . . . . . . . . . . . . . . . . 34 Complex number multiplication for 16 bits . . . . . . . . . . . . . . . . . . . . . . 36 32 bit real number multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Block Diagram of the FIR Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 Lattice FIR Filter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Block Diagram of FIR Decimation Filter . . . . . . . . . . . . . . . . . . . . . . . . 43 Equivelent Implementation of FIR Decimation Filter . . . . . . . . . . . . . . 43 Block Diagram of FIR Interpolation Filter . . . . . . . . . . . . . . . . . . . . . . . 44 Equivelent Implementation of FIR Interpolation Filter . . . . . . . . . . . . . 44 Fir_16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Fir_32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Fir_cplx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 Fir_sym . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 Fir_lat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Fir_dec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 Fir_inter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 Canonical Form (Direct Form 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 Cascaded Biquad IIR Filter with Direct Form 2 Implementation . . . . . 75 Cascaded Biquad IIR Filter with Direct Form 1 Implementation . . . . . 76 Lattice IIR Filter Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 IIR_1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 IIR_2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 IIR_bi_1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 IIR_bi_2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 IIR_bi2_32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 IIR_bi_24 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 IIR_lat. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 Adaptive filter with LMS algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 Adap_filter_16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 Adap_filter_32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 Complexity Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 8-point DIT FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 Alternate Form of 8-point DIT FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 8-point DIF FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 Alternate Form of 8-point DIF FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 8-point DIT complex FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 Butterfly of Radix-2 DIT complex FFT . . . . . . . . . . . . . . . . . . . . . . . . 130 8-point DIF complex FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 Butterfly of Radix-2 DIF complex FFT . . . . . . . . . . . . . . . . . . . . . . . . 132 Bit_reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 User’s Manual for Keil Compiler II-1 V 1.2, 2003-11 DSP Library for XC16x Family Figure 4-44 Figure 4-45 Figure 4-46 Figure 4-47 Figure 4-48 Figure 4-49 Figure 4-50 Figure 4-51 Figure 4-52 Figure 4-53 Figure 4-54 Figure 4-55 Figure 4-56 Figure 4-57 Figure 4-58 Figure 4-59 Figure 4-60 Figure 4-61 FloatTo1Q15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 Real_DIT_FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 Real_DIF_IFFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152 Q15toFloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 Matrix_mul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 Matrix_trans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 Sine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 P_series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 Windowing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 Division of two 1Q15 fractional inputs . . . . . . . . . . . . . . . . . . . . . . . . 178 Division of 1Q31 fractional by 1Q15 fractional . . . . . . . . . . . . . . . . . . 180 Raw autocorrelation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 Biased autocorrelation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 Unbiased autocorrelation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 Raw cross-correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 Biased cross-correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 Unbiased cross-correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 User’s Manual for Keil Compiler -2 V 1.2, 2003-11 DSP Library for XC16x Family List of Tables Page Table 2-1 Table 2-2 Table 3-1 Table 4-1 Directory Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 Source files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 XC166Lib Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Argument Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 User’s Manual for Keil Compiler III-1 V 1.2, 2003-11 DSP Library for XC16x Family Preface This is the User Manual of version 1.2 for XC166Lib - a DSP library for Infineon XC16x microcontroller family. To make the user easy by using the XC166Lib we provide a separate User Manual for each compiler to describe the implementations. This User Manual describes the implementations for Keil compiler. XC16x microcontroller family is the most recent generation of the popular C166 microcontroller families. The core of XC16x family is C166S V2 Core that combines high performance with enhanced modular architecture. Impressive DSP performance and advanced interrupt handling with fast context switching make XC16x the instrument of choice for powerful applications. This manual describes the implementation of essential algorithms for general digital signal processing applications on the XC16x using Keil compiler. For Keil compiler the DSP library is developed mainly in inline assembly and most of the source code files is stored in c files. The DSP library can be used as a library of basic functions for developing bigger DSP applications on XC16x microcontroller. The library serves as a user guide and a reference for XC16x microcontroller DSP programmers. It demonstrates how the processor’s architecture can be exploited for achieving high performance. The various functions and algorithms implemented and described are: • Arithmetic Functions • Filters – FIR – IIR – Adaptive filters • Transforms – FFT – IFFT • Matrix Operations • Mathematical Operations • Statistical Functions – Auto-correlation – Cross-correlation – Convolution Each function is described in detail under the following headings: Signature: This gives the function interface. Inputs: Lists the inputs to the function. User’s Manual for Keil Compiler -10 V 1.2, 2003-11 DSP Library for XC16x Family Outputs: Lists the output of the function if any. Return: Gives the return value of the function if any. Implementation Description: Gives a brief note on the implementation, the size of the inputs and the outputs, alignment requirements etc. Pseudocode: The implementation is expressed as a pseudocode using C conventions. Techniques: The techniques employed for optimization are listed here. Register Usage: Lists the registers that are used for parameter transfer from C to Assembly or inverse. Assumptions: Lists the assumptions made for an optimal implementation such as constraint on DPRAM. The input output formats are also given here. Memory Note: A detailed sketch showing how the arrays are stored in memory, the alignment requirements of the different memories, the nature of the arithmetic performed on them. The diagrams give a great insight into the actual implementation. Further, the path of an Example calling program, the Cycle Count and Code Size are given for each function. User’s Manual for Keil Compiler -11 V 1.2, 2003-11 DSP Library for XC16x Family Organization Chapter 1, Introduction, gives a brief introduction of the XC166Lib and its features. Chapter 2, Installation and Build, describes the XC166Lib content, how to install and build the XC166Lib. Chapter 3, DSP Library Notations, describes the DSP Library data types, arguments, calling a function from the C code and the assembly code, and the implementation notes. Chapter 4, Function Descriptions, describes the arithmetic functions, FIR filters, IIR filters, Adaptive filters, Fast Fourier Transforms, Matrix operations and Mathematical operations. Each function is described with its signature, inputs, outputs, return, implementation description, pseudocode, techniques used, registers used for parameter transfer, assumptions made, memory note, example, cycle count and code size. Chapter 5, gives the list of related references. Acknowledgements All source codes in XC166Lib have been designed, developed and tested using Tasking and Keil Tool chains. We in advance would like to acknowledge users for their feedback and suggestions to improve this product. Guangyu Wang XC166Lib Developer - Infineon User’s Manual for Keil Compiler -12 V 1.2, 2003-11 DSP Library for XC16x Family Acronyms and Definitions Acronyms Definitions DIT Decimation-In-Time DIF Decimation-In-Frequency LMS Least Mean Square DSP Digital Signal Processing XC166Lib DSP Library functions for XC16x FFT Fast Fourier Transform FIR Finite Impulse Response IIR Infinite Impulse Response Documentation/Symbol Conventions The following is the list of documentation/symbol conventions used in this manual. Documentation/Symbol Conventions Documentation/Sym- Description bol convention Courier Pseudocode Times-italic File name User’s Manual for Keil Compiler -13 V 1.2, 2003-11 1 Introduction DSP Library for XC16x Family Introduction 1.1 Introduction to XC166Lib, a DSP Library for XC16x The XC166Lib, a DSP Library for XC16x microcontroller is C-callable, hand-coded assembly, general purpose signal processing routines. The XC166Lib includes commonly used DSP routines. The throughput of the system using the XC166Lib routines is considerably better than those achieved using the equivalent code written in ANSI C language. The XC166Lib significantly helps in understanding the general purpose signal processing routines, its implementation on XC16x microcontroller family. It also reduces the DSP application development time. Furthermore, The XC166Lib is also a good reference for XC16x microcontroller DSP programmer. The routines are broadly classified into the following functional categories: • Arithmetic functions • FIR filters • IIR filters • Adaptive filters • Fast Fourier Transforms • Matrix operations • Mathematical operations • Statistical functions 1.2 Features • Common DSP algorithms with source codes • Hand-coded and optimized assembly modules with CoMAC instructions • C-callable functions on Keil compiler • Multi platform support - Win 95, Win 98, Win NT • Examples to demonstrate the usage of functions • Example input test vectors and the output test data for verification • Complete User’s manual covering many aspects of implementation 1.3 Future of the XC166Lib The planned future releases will have the following improvements. • Expansion of the library by adding more functions in the domain of generic core routines of DSP. • Upgrading the existing 16 bit functions to 32 bit 1.4 Support Information Any suggestions for improvement, bug report if any, can be sent via e-mail to User’s Manual for Keil Compiler 1-14 V 1.2, 2003-11 C166Lib-support@infineon.com. DSP Library for XC16x Family Introduction Visit www.infineon.com /C166DSPLIB for update on XC166Lib releases. User’s Manual for Keil Compiler 1-15 V 1.2, 2003-11 2 Installation and Build DSP Library for XC16x Family Installation and Build 2.1 XC166Lib Content The following table depicts the XC166Lib content with its directory structure. Table 2-1 Directory name XC166Lib Tasking Keil Source Include Docs Examples Directory Structure Contents Files Directories which has all the files related None to the XC166Lib Directories which has all the files related None to the XC166Lib for Tasking compiler Directories which has all the files related None to the XC166Lib for Keil compiler Directories of source files. Each directory *.c has respective assembly language *.asm implementation files of the library *.a66 functions Directory and common include files for DspLib_Keil.h ’C’ of the Keil compiler User’s Manual for Keil compiler *.pdf Example directories . Each directory *.c contains example "c" functions to depict the usage of XC166Lib. 2.2 Installing XC166Lib XC166Lib is distributed as a ZIP file. To install the XC166Lib on the system, unzip the ZIP file and extract them to the defined directory. The directory structure is as given in “XC166Lib Content” on Page 2-16 2.3 Building XC166Lib Include the DspLib_Keil.h into your project and also include the same into the files that need to call the library function like: #include “DspLib_Keil.h” Now include the respective source files for the required functionality into your project. Refer the functionality table, Table 2-2. User’s Manual for Keil Compiler 2-16 V 1.2, 2003-11 Build the system and start using the library. 2.4 Source Files List Table 2-2 Source files Complier: Keil Arithmetic Functions CplxAdd_16.c CplxSub_16.c CplxMul_16.c Mul_32.c FIR Filters Fir_16.c Fir_32.c Fir_cplx.c Fir_dec.c Fir_inter.c Fir_lat.c Fir_sym.c IIR Filters IIR_1.c IIR_2.c IIR_bi_1.c IIR_bi_2.c IIR_bi_24.c IIR_bi2_32.c IIR_lat.c Adaptive Filters Adap_filter_16.c Adap_filter_32.c FFT Bit_reverse.c real_DIT_FFT.a66 real_DIF_IFFT.a66 Matrix Operations User’s Manual for Keil Compiler 2-17 DSP Library for XC16x Family Installation and Build V 1.2, 2003-11 Table 2-2 Source files Matrix_mul.c Matrix_trans.c Mathematical Operations Power_series.c Windowing.c Sine.c div_q15.c div_q31.c Sqrt.c Statistical Functions Auto_raw.c Auto_bias.c Auto_unbias.c Cross_raw.c Cross_bias.c Cross_unbias.c Convol.c Miscellaneous FloatTo1Q15.c Q15toFloat.c DSP Library for XC16x Family Installation and Build User’s Manual for Keil Compiler 2-18 V 1.2, 2003-11 3 DSP Library Notations 3.1 XC166Lib Data Types The XC166Lib handles the following fractional data types. DSP Library for XC16x Family DSP Library Notations Table 3-1 XC166Lib Data Types 1Q15 (DataS) 1Q15 operand is represented by a short data type that is predefined as DataS in header files DspLib_Keil.h. 1Q31 (DataL) 1Q31 operand is represented by a long data type that is predefined as DataL in header files DspLib_Keil.h. CplxS Complex data type that contains the two 1Q15 data arranged in Re-Im format. CplxL Complex data type that contains the two 1Q31 data arranged in Re-Im format. 3.2 Calling a DSP Library Function from C Code After installing the XC166Lib, include a XC166Lib function in the source code as follows: 1. Choose the memory model for C compiler 2. Include the header file DspLib_Keil.h 3. Include the source file that contains required DSP functions into the project along with the other source files 4. Include the Keil compiler system files TRAPS.C and START_V2.A66 5. Set the Options for Target - Device to select an MCU with XC16x 6. Build the system 3.3 XC166Lib Example Implementation The examples of how to use the XC166Lib functions are implemented and are placed in examples subdirectory. This subdirectory contains a subdirectory for set of functions. 3.4 XC166Lib Implementation - A Technical Note for Keil Compiler 3.4.1 Memory Issues The XC16x microcontroller family uses the C166S V2 Core that is a 16 bit microcontroller core, with impressive DSP performance. There are two sets of instructions for C166S V2 Core, namely Normal Instruction Set and DSP Instruction Set (MAC-instructions). Normal instruction set is compatible with the microcontroller family C166, while the DSP instruction set is especially designed for implementing DSP algorithms. XC166Lib was User’s Manual for Keil Compiler 3-19 V 1.2, 2003-11 DSP Library for XC16x Family DSP Library Notations developed mainly using DSP instruction set. But the normal instruction set has been also often used in the routines, in particular, for initializing memories and registers. For each instruction set there is a different addressing mode. DSP instructions use some standard C166 addressing modes such as GPR direct and #data5 for immediate shift value. To supply the MAC instructions with up to 2 new operands in one CPU cycle, new MAC instruction addressing models have been added in C166S V2 Core. These allow indirect addressing with address pointer post-modification. Double indirect addressing requires 2 pointers, one of which can be supplied by any GPR, the other is provided by one of two Specific Function Registers (SFRs) IDX0 and IDX1. Two pairs of offset registers QR0/QR1 and QX0/QX1 are associated with each pointer (GPR or IDXi). The GPR pointer gives access to the entire memory space, whereas IDXi are limited to the internal Dual-Port RAM (DPRAM), except for the CoMOV instruction. The XC166Lib is implemented with the C166S V2 Core memory addressing architecture. The following information gives memory conditions in order to work properly. Because the specific function register IDXi is limited to the internal DPRAM, in order to use MAC instructions properly we must first initialize IDXi with the address pointed to DPRAM space from 00’F200H to 00’FE00H (3KBytes) before using MAC instructions. This means that we must locate one of operands in MAC-instructions with double indirect addressing modes in range from 00’F200H to 00’FE00H. Using Keil complier we can easily realize it through defining the variables which should be located in the DPRAM area as on-chip memory type idata, e.g. short idata x[n] ; After compiling the vector x will be automatically located in the DPRAM area because Keil compiler locates all variables with the memory type idata in the DPRAM area. Note that C166S V2 Core has defined 3 KBytes of DPRAM. However, the XC16x microcontroller family is equipped with only 2 KBytes of DPRAM in the range from 00’F600H to 00’FE00H. The limited DPRAM area can make difficulty by executing DSP algorithms on XC16x, if the size of the vector is larger than 2 KBytes, e.g. a 1024-point Fir filter. When using pointer post-modification addressing models, the address pointed to must be a legal address, even if its content is not modified. An odd value will trigger the classB hardware Trap (Illegal Word Operand Access Trap (ILLOPA)). 3.4.2 Memory Models for Keil C Compiler Just as we said, the DSP library is developed mainly for calling from a C program. Therefore, the memory modes selected by C and assembly modules must be same in order to avoid memory model conflicts. Keil tool chain supports seven memory models: tiny, small, compact, hcompact, medium, large and hlarge. The basic differences between them are: User’s Manual for Keil Compiler 3-20 V 1.2, 2003-11 DSP Library for XC16x Family DSP Library Notations Beside the tiny model operating in the non-segmented CPU mode the other six memory models operate in the segmented CPU mode. The variables by the tiny and small memory models are located in the near area and the functions calls generate near calls (up to 64Kb code size). The both memory modes provide the same efficient code. The difference of tiny and small memory models is that the code and data in the tiny model are limited to the first segment of 64K, while the small model allows code and data to locate anywhere in the space. The compact, hcompact and medium models operate as small model does, except that the compact model uses the memory type far for variables, the hcompact model uses the memory type huge for variables and the medium model generates function calls as far call. For both large and hlarge models the function calls are generated as far calls by default. The difference between them is that the large memory locates the variables in the far memory, while the hlarge locates the variables in the huge memory. Because the most functions in the library are written in the inline assembly, for those library functions the parameter passing will be done by the compiler automatically. So the library functions written in the inline assembly can be used in all memory models without regarding the memory types of pointers passed from C to assembly code. But for the library functions which are pure assembly module and stored in assembly file *.a66, if we want to use them in the memory models with far or huge variable memory types, we need to redefine the pointers in the function arguments as near pointers, because the library functions use only 16 bit pointers by parameter passing from C to assembly routine. There are only 16 registers R0-R15 on XC16x that can be used for programming. By calling the XC166Lib routines from C, according to Keil Compiler conventions the registers from R8 to R12 will be used to translate parameters from C to the assembly code, and the rest parameters will be translated through using stack. R4 and R5 are used to store the output values after executing the XC166Lib routines. R1-R7 can be used in the assembly routine without regard for preserving their contents. 3.4.3 Optimization Techniques DSP optimization techniques depend strongly on the core architecture. So, different cores have different optimization techniques. Furthermore, the number of tricks that can be played by a DSP programmer are endless. In the development of XC166Lib the following optimization techniques have been used. • data dependencies removing Due to the pipeline requirement of the C166S V2 CPU there are a lot of possible data dependencies between instructions using GPRs. In the C166S V2 Core the dedicated hardware is added to detect and resolve the data dependencies. However, in the C166S V2 Core none of the instructions using indirect addessing modes are capable User’s Manual for Keil Compiler 3-21 V 1.2, 2003-11 DSP Library for XC16x Family DSP Library Notations of using a GPR, which is to be updated by one of the two immediately preceding instructions. This means that the instruction using indirect addressing modes will lead two cycles stall. To use these two cycles for the optimization we can insert before this instruction a multicycle or two single cycle instructions that must not update the GPR used for indirect addressing. Example: Assembly without optimization ( 6 cycles ) ............. ADD R1, R2 MOV R8, [R1] ; instruction using indirect addressing mode ADD R5, R1 ADD R6, R1 ............. Assembly with optimization ( 4 cycles ) ............. ADD R1, R2 ADD R5, R1 ; inserted one cycle instruction ADD R6, R1 ; inserted one cycle instruction MOV R8, [R1] ; instruction using indirect addressing mode ............. • memory bandwidth conflicts removing Memory bandwidth conflicts can occur if instructions in the pipeline access the same memory ares at the same time. The CoXXX instructions are specially designed for DSP implementation. To avoid the memory bandwidth conflicts in the DPRAM ares, one of the operands should be located in the internal SRAM to guarantee a single cycle execution time of the CoXXX instructions. • instruction re-ordering By writing DSP routines with CoXXX instructions it is often needed to change and update the Special Function Registers (SFRs), such as IDX0, IDX1, QX0, QX1, QR0, QR1, and so on. CPU-SFRs control the CPU functionality and behavior. Therefore, special care is required to ensure that instructions in the pipeline always work with the correct SFRs values. With instruction re-ordering the flow of instructions through the pipeline can be improved to optimize the routines. Example: Assembly code without optimization ( 7 cycles ) ............. User’s Manual for Keil Compiler 3-22 V 1.2, 2003-11 EXTR #1 MOV IDX1, #12 ; initialize IDX1 with 12 CoMUL [IDX1], [R1] MOV R6, R1 ADD R2, R1 ............. Assembly code with optimization ( 5 cycles ) ............. EXTR #1 MOV IDX1, #12 ; initialize IDX1 with 12 MOV R6, R1 ; instruction re-ordering ADD R2, R1 ; instruction re-ordering CoMUL [IDX1], [R1] ............. • loop unrolling The equation is written twice or more inside a loop. Example (unrolling factor 2): Assembly code without loop unrolling ( 17 cycles ) ............ MOV R3, #3 loop: CoMAC [IDX0+], [R1+] CoMAC [IDX0+], [R1+] CMPD1 R3,#0h JMPR cc_NZ,loop ............ Assembly code with loop unrolling ( 13 cycles ) ............ MOV R3, #1 loop: CoMAC [IDX0+], [R1+] CoMAC [IDX0+], [R1+] CoMAC [IDX0+], [R1+] User’s Manual for Keil Compiler 3-23 DSP Library for XC16x Family DSP Library Notations V 1.2, 2003-11 CoMAC [IDX0+], [R1+] DSP Library for XC16x Family DSP Library Notations CMPD1 JMPR ............ R3,#0h cc_NZ,loop 3.4.4 Cycle Count The cycle count given for each function in this User’s Manual represents the cycles to be needed for executing the assembly instructions in the function. They have been verified on XC16x boards. To some degree one can understand as the theoretical cycle count. The given values of cycle count can only be achieved only on conditions that all data and assembly codes are located in the internal memory area and no pipeline conflicts occur in the program. Note that the real cycle count may be much larger than the given values, if the data or source code are located in the external memory. 3.4.5 Testing Methodology The XC166Lib is tested on XC16x board. The test is believed to be accurate and reliable. However, the developer assumes no responsibility for the consequences of use of the XC166Lib. User’s Manual for Keil Compiler 3-24 V 1.2, 2003-11 DSP Library for XC16x Family Function Descriptions 4 Function Descriptions Each function is described with its signature, inputs, outputs, return, implementation description, pseudocode, techniques used, assumptions made, register usage, memory note, example, cycle count and code size. Functions are classified into the following categories. • Arithmetic functions • FIR filters • IIR filters • Adaptive filters • Fast Fourier Transforms • Matrix operations • Mathematical operations • Statistical functions 4.1 Conventions 4.1.1 Argument Conventions The following conventions have been followed while describing the arguments for each individual function. Table 4-1 Argument X, x Y, y D_buffer, d_buffer N_x H, h N_h DataS DataL Argument Conventions Convention Input data or input data vector beside in FFT functions, where X representing the FFT spectra of the input vector x Output data or output data vector Delay buffer The size of input vectors Filter coefficient vector The size of coefficient vector H Data type definition equating a short, a 16-bit value representing a 1Q15 number Data type definition equating a long, a 32-bit value representing a 1Q31 number User’s Manual for Keil Compiler 4-25 V 1.2, 2003-11 DSP Library for XC16x Family Function Descriptions Table 4-1 Argument CplxS CplxL Argument Conventions Convention Data type definition equating a short, a 16-bit value representing a 1Q15 complex number Data type definition equating a long, a 32-bit value representing a 1Q31 complex number User’s Manual for Keil Compiler 4-26 V 1.2, 2003-11 4.2 Arithmetic Functions DSP Library for XC16x Family Function Descriptions 4.2.1 Complex Numbers A complex number z is an ordered pair (x,y) of real numbers x and y, written as z= (x,y) where x is called the real part and y the imaginary part of z. 4.2.2 Complex Number Representation A complex number can be represented in different ways, such as Rectangular form : C = R + iI [4.1] Trigonometric form : C = M[ cos(φ) + j sin (φ)] [4.2] Exponential form : C = Meiφ [4.3] Magnitude and angle form : C = M∠φ [4.4] In the complex functions implementation, the rectangular form is considered. 4.2.3 Complex Plane To geometrically represent complex numbers as points in the plane two perpendicular coordinate axis in the Cartesian coordinate system are chosen. The horizontal x-axis is called the real axis, and the vertical y-axis is called the imaginary axis. Plot a given complex number z= x + iy as the point P with coordinates (x, y). The xy-plane in which the complex numbers are represented in this way is called the Complex Plane. User’s Manual for Keil Compiler 4-27 V 1.2, 2003-11 (Imaginary Axis) y DSP Library for XC16x Family Function Descriptions P z = x + iy O (Real Axis) x Figure 4-1 The Complex Plane 4.2.4 Complex Arithmetic Addition if z1 and z2 are two complex numbers given by z1 =x1+iy1 and z2 = x2 + iy2, z1+z2 = (x1+iy1) + (x2 + iy2) = (x1+x2) + i(y1+y2) Subtraction if z1 and z2 are two complex numbers given by z1 =x1+iy1 and z2 = x2 + iy2, z1-z2 = (x1-x2) + i(y1-y2) Multiplication if z1 and z2 are two complex numbers given by z1 =x1+iy1 and z2 = x2 + iy2, z1.z2 = (x1+iy1).(x2 + iy2) = x1x2 + ix1y2 + iy1x2 + i2 y1y2 = (x1x2 - y1y2) + i(x1y2 + x2y1) [4.5] [4.6] [4.7] User’s Manual for Keil Compiler 4-28 V 1.2, 2003-11 DSP Library for XC16x Family Function Descriptions Conjugate The complex conjugate, z of a complex number z = x+iy is given by z = x - iy [4.8] and is obtained by geometrically reflecting the point z in the real axis. 4.2.5 Complex Number Schematic 15 0 High Addr. Imaginary Low Addr. Real Sign Bit Figure 4-2 16-bit Complex number representation 4.2.6 Descriptions The following arithmetic functions for 16 bit and 32 bit are described. • 16 bit complex addition • 16 bit complex subtraction • 16 bit complex multiplication • 32 bit real multiplication User’s Manual for Keil Compiler 4-29 V 1.2, 2003-11 CplxAdd_16 Signature DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Addition void CplxAdd_16 (CplxS* X, CplxS* Y, ClpxS* R) Inputs Output Return Implementation Description X : Pointer to 16 bit Complex input value in 1Q15 format Y : Pointer to 16 bit Complex input value in 1Q15 format None Pointer to the sum of two complex numbers as a 16 bit complex number in 1Q15 format This function computes the sum of two 16 bit complex numbers. Wraps around the result in case of overflow. The algorithm is as follows Rr = xr + yr Ri = xi + yi Pseudo code { R.real = X.real + Y.real; //add the real part R.imag = X.imag + Y.imag; //add the imaginary part return R; } Techniques None Assumption Register Usage • From .c file to inline assembly file: Decided by compiler [4.9] User’s Manual for Keil Compiler 4-30 V 1.2, 2003-11 CplxAdd_16 Memory Note DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Addition (cont’d) 15 0 Imaginary Real 15 0 Imaginary Real + + 15 0 Imaginary Real Figure 4-3 Complex Number addition for 16 bits Example Cycle Count Code Size C166Lib\Keil\Examples\Arith_16\Arith_16.c Initialization and 6 read input values Real Addition 3 Imaginary Addition 3 Total 12 Initialization and read input values Real Addition 12 bytes 10 bytes User’s Manual for Keil Compiler 4-31 V 1.2, 2003-11 CplxAdd_16 DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Addition (cont’d) Imaginary Addition 10 bytes Total 32 bytes User’s Manual for Keil Compiler 4-32 V 1.2, 2003-11 CplxSub_16 Signature DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Subtraction void CplxSub_16 (CplxS* X, CplxS* Y, CplxS* R ) Inputs Output Return Implementation Description X : Pointer to 16 bit complex input value in 1Q15 format Y : Pointer to 16 bit complex input value in 1Q15 format None Pointer to the difference of two complex numbers as a 16 bit complex number This function computes the difference of two 16 bit complex numbers. Wraps around the result in case of underflow. The algorithm is as follows. Rr = xr – yr Ri = xi – yi Pseudo code { R.real = X.real - Y.real; //subtract the real part R.imag = X.imag - Y.imag; //subtract the imaginary part return R; } Techniques None Assumption Register Usage • From .c file to inline assembly file: Decided by compiler [4.10] User’s Manual for Keil Compiler 4-33 V 1.2, 2003-11 CplxSub_16 Memory Note DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Subtraction (cont’d) 15 0 Imaginary Real 15 0 Imaginary Real 15 0 Imaginary Real Figure 4-4 Complex number subtraction for 16 bits Example Cycle Count Code Size C166Lib\Keil\Examples\Arith_16\Arith_16.c Initialization and 6 read input values Real Subtraction 3 Imaginary 3 Subtraction Total 12 Initialization and read input values Real Subtraction Imaginary Subtraction Total 12 bytes 10 bytes 10 bytes 32 bytes User’s Manual for Keil Compiler 4-34 V 1.2, 2003-11 CplxMul_16 Signature DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Multiplication void CplxMul_16 (CplxS* X, CplxS* Y, CplxS* R ) Inputs Return Implementation Description X : Pointer to 16 bit Complex input value in 1Q15 format Y : Pointer to 16 bit Complex input value in 1Q15 format Pointer to the multiplication result in 1Q15 format This function computes the product of the two 16 bit complex numbers. Wraps around the result in case of overflow. The complex multiplication is computed as follows. Rr = xr × yr – xi × yi Ri = xi × yr + xr × yi Pseudo code { R->real = X.real*Y.real - Y.imag*X.imag; R->imag = X.real*Y.imag + Y.real*X.imag; } Techniques None Assumption Register Usage • From .c file to inline assembly file: Decided by compiler User’s Manual for Keil Compiler 4-35 V 1.2, 2003-11 CplxMul_16 Memory Note DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Multiplication (cont’d) 15 0 Imaginary Real x 15 0 Imaginary x x Real x + 15 0 Imaginary - Real Figure 4-5 Complex number multiplication for 16 bits Example C166Lib\Keil\Examples\Arith_16\Arith_16.c Cycle Count Initialization and 5 read input values Real multiplication 3 Imaginary 3 multiplication Total 11 Code Size Initialization and read input values Real multiplication User’s Manual for Keil Compiler 4-36 10 bytes 12 bytes V 1.2, 2003-11 CplxMul_16 DSP Library for XC16x Family Function Descriptions 16 bit Complex Number Multiplication (cont’d) Imaginary multiplication Total 12 bytes 34 bytes User’s Manual for Keil Compiler 4-37 V 1.2, 2003-11 Mul_32 Signature 32 bit Real Multiplication DSP Library for XC16x Family Function Descriptions DataL Mul_32 (DataL X, DataL Y) Inputs Return Implementation Description X : 32 bit real input value in 1Q31 Y : 32 bit real input value in 1Q31 Multiplication result in 1Q31 format This function computes the product of the two 32 bit real numbers. Wraps around the result in case of overflow. The multiplication is computed as follows. R = xL × yL + xL × yH + (x H × yL + xH × yH ) » 16 Pseudo code { R = xL*yL + xL*yH + (xH*yL + xH*yH)>>16 ; } Techniques None Assumption Register Usage • From .c file to inline assembly file: Decided by compiler. • From .asm file to .c file: RL(LSW) is stored in R4. RH(MSW) is stored in R5. User’s Manual for Keil Compiler 4-38 V 1.2, 2003-11 Mul_32 Memory Note DSP Library for XC16x Family Function Descriptions 32 bit Real Multiplication (cont’d) 15 0 XL XH x 32 16 x x + YL x YH 15 0 RL RH 32 16 Figure 4-6 32 bit real number multiplication Example C166Lib\Keil\Examples\Arith_16\Arith_16.c Cycle Count Code Size Initialization Multiplication Output Total Initialization Multiplication Output Total 5 8 2 15 10 bytes 32 bytes 8 50 bytes User’s Manual for Keil Compiler 4-39 V 1.2, 2003-11 DSP Library for XC16x Family 4.3 FIR Filters The FIR (Finite Impulse Response) filter, as its name suggests, will always have a finite duration of non-zero output values for given finite duration of non-zero input values. FIR filters use only current and past input samples, and none of the filter's previous output samples, to obtain a current output sample value. For causal FIR systems, the system function has only zeros (except for poles at z=0). The realization of an FIR filter has many forms. But the transversal and lattice forms are most useful structures in practice. 4.3.1 Transversal Form The transversal form of an Fir filter showed in Figure 4-7 is realized by a tapped delay line. The delay line stores the past input values. The input x(n) for the current calculation will become x(n-1) for the next calculation. The output from each tap is summed to generate the filter output. For a general N tap FIR filter, the difference equation is N–1 ∑ y(n) = hi ⋅ x(n – i) i=0 where, [4.11] x(n) : the filter input for nth sample y(n) : output of the filter for nth sample hi : filter coefficients N : filter order The filter coefficients, which decide the scaling of current and past input samples stored in the delay line, define the filter response. The transfer function of the filter in Z-transform is ∑ H[z] = YX-----[[---zz---]] = N – 1 hi ⋅ z–i i=0 [4.12] User’s Manual for Keil Compiler -40 V 1.2, 2003-11 DSP Library for XC16x Family x(n) x(n) (Filter Input) h0 X z-1 x(n-1) z-1 h1 X Delay Line z-1 x(n-N+1) hN-1 X + Figure 4-7 Block Diagram of the FIR Filter y(n) (Filter Output) 4.3.2 Lattice Form The structure of a lattice FIR filter is showed in Figure 4-8 . Each stage of the filter has an input and output that are related by the equations: yi(n) = yi – 1(n) + kiui(n – 1) ui(n) = kiyi – 1(n) + ui – 1(n – 1) 1 ≤ i≤ M [4.13] The initial values are equal to the filter input x(n): User’s Manual for Keil Compiler -41 V 1.2, 2003-11 DSP Library for XC16x Family y0(n) = x(n) u0(n) = x(n) At the last stage we have the output of the lattice FIR filter y(n)=yM(n). x(n) k1 z-1 y1(n) + u1(n) k2 + z-1 + y2…(n). kM + …. z-1 u2(n) yM(n) + + uM(n) Figure 4-8 Lattice FIR Filter 4.3.3 Multirate FIR Filters Multirate filters are a kind of digital filters that change the sampling rate of a digital signal. A multirate filter converts a digital signal with sampling rate M to another digital signal with sampling rate N. The both digital signals represent the same analog signal at differtent sampling rates. A multirate filter can be realized in an FIR filter or an IIR filter. Due to the advantages of FIR filters, such as linear phase, unconditional stability, simple structure and easy coefficient design, most of the multirate filters are implemented with FIR filters. Hier we describe only FIR multirate filters. The basic operations of the multirate filters are decimation and interpolation. Decimation reduces the sample rate of a signal and can be used to eliminate redundant or unnecessary information contained in the signal. Interpolation increases the sample rate of a signal through filling in missing information between the samples of a signal based on the calculation on the existing data. 4.3.3.1 FIR Decimation Filter The FIR decimation filter can be described using the equation N–1 ∑ y(m) = h(k)x(mM – k) k=0 [4.14] User’s Manual for Keil Compiler -42 V 1.2, 2003-11 DSP Library for XC16x Family where h(k) is filter coefficient vector, x(n) is the input signal and M represents the decimation factor. Figure 4-9 shows the block diagram of an FIR decimation filter. Its equivelent form is showed in Figure 4-10 . x(n) w(n) h(k) LM y(m ) Figure 4-9 Block Diagram of FIR Decimation Filter h(0) x(n) LM z-1 h(1) z-1 h(2) z-1 h(3) i i z-1 i i i i i i h(N-1) i y(m) Figure 4-10 Equivelent Implementation of FIR Decimation Filter 4.3.3.2 FIR Interpolation Filter In comparison with decimation filter the interpolation filter can be used in the reconstruction of a digital signal from another digital signal. Figure 4-11 shows a block diagram of an interpolation filter, where the low-pass filter of the interpolator uses an Fir filter structure. An Fir interpolation filter can be described as N–1 ∑ y(m) = h(k)x((m – k) ⁄ L) k=0 [4.15] User’s Manual for Keil Compiler -43 V 1.2, 2003-11 DSP Library for XC16x Family for m-k=0, L, 2L, ... . An equivelent implementation of the Fir interpolation filter is showed in Figure 4-12. x(n) L h(k) Figure 4-11 Block Diagram of FIR Interpolation Filter y(m ) x(n) KL z-1 h(0) h(1) y(m) z-1 h(2) z-1 h(3) i i z-1 i i i i i i h(N-1) i Figure 4-12 Equivelent Implementation of FIR Interpolation Filter 4.3.4 Descriptions The following FIR filter routines are implemented in the library. • Fir filter with transversal form, 16 bit filter coefficients, Sample processing • Fir filter with transversal form, 32 bit filter coefficients, Sample processing • Complex Fir filter with transversal form, 16 bit filter coefficients, Vector processing • Symmetric Fir filter, 16 bit filter coefficients, Vector processing • Lattice Fir filter, 16 bit filter coefficients, Vector processong • Fir decimation filter, 16 bit filter coefficients, Vector processing • Fir interpolation filter, 16 bit filter coefficients, Vector processing User’s Manual for Keil Compiler -44 V 1.2, 2003-11 DSP Library for XC16x Family Fir_16 Signature Inputs Output Return Implementation Description FIR Filter with transversal form, 16 bit coefficients, Sample processing DataS Fir_16 ( DataS* H, DataS* IN, DataS N_h, DataS* D_buffer) H : Pointer to filter coefficients in 1Q15 format IN : Pointer to the new input sample in 1Q15 format L : Filter order D_buffer Pointer to delay buffer : Y : Filter output in 1Q15 format The implementation of FIR filter uses transversal structure (direct form). A single input is processed at a time and output for every sample is returned. The filter operates on 16-bit real input, 16-bit coefficients and gives 16-bit real output. The number of coefficients given by the user is arbitrary. The delay line is implemented in parallel to the multiply-accumulate operation using instructions CoMACM. Delay buffer will be located in the DPRAM area. User’s Manual for Keil Compiler -45 V 1.2, 2003-11 DSP Library for XC16x Family Fir_16 FIR Filter with transversal form, 16 bit coefficients, Sample processing (cont’d) Pseudo code { short x(N_h)={0,...}; short Y; short i; //Input vector //Filter result //Update the input vector with the new input value for(i=0; i0; i--) x(n-i-1) = x(n-i); x(n) = IN; //New input sample ;IIR filtering y(n) = 0; for(i=0 to N) y(n) = y(n)+b(i)*x(n-i); for(i=1 to N) y(n) = y(n)+a(i)*y(n-i); Y = y(n) return Y; } Techniques Assumptions Register Usage Memory Note //Filter Output returned • Memory bandwidth conflicts removing • Instruction re-ordering • Delay buffer must be located in DPRAM area • From .c file to .asm file: Decided by compiler • From .asm file to .c file: Y is stored in R4 User’s Manual for Keil Compiler -80 V 1.2, 2003-11 DSP Library for XC16x Family IIR_1 16 bit filter coefficients, Direct form 1, Sample processing (cont’d) High Addr. Low Addr. DPRAM y(n-1) : y(n-N+1) y(n-N) x(n-1) x(n-1) x(n-2) : x(n-M+2) x(n-M+1) x(n-M) High Addr. Low Addr. Memory a(0) a(1) : a(N-2) a(N-1) b(0) b(1) : b(M-2) b(M-1) b(M) 2Q14 Before Figure 4-24 IIR_1 IDX0 CoMACM CoMACM DPRAM y(n) : y(n-N+2) y(n-N+1) x(n) x(n) x(n-1) : x(n-M+3) x(n-M+2) x(n-M+1) Memory a(0) a(1) : a(N-2) a(N-1) b(0) b(1) : b(M-2) b(M-1) b(M) 2Q14 After IDX0 Example C166Lib\Keil\Examples\Filters\IIR\ IIRform1.c Cycle Count Memory Initialization 10 Read the new 2 sample into DPRAM User’s Manual for Keil Compiler -81 V 1.2, 2003-11 IIR_1 Code Size DSP Library for XC16x Family 16 bit filter coefficients, Direct form 1, Sample processing (cont’d) First IIR loop N+1 Repeat count re- 2 initialization Second IIR loop N Write the output 6 Total 2N + 21 Example: N=4 cycle = 29 Memory initialization Read the new sample into DPRAM First IIR loop Repeat count reinitialization Second IIR loop Write the output 20 bytes 8 bytes 10 bytes 4 bytes 10 bytes 22 bytes Total 74 bytes User’s Manual for Keil Compiler -82 V 1.2, 2003-11 DSP Library for XC16x Family IIR_2 Signature 16 bit filter coefficients, Direct form 2, Sample processing DataS IIR_2 ( DataS* h, DataS* IN, DataS N, DataS* u) Inputs h : Pointer to filter coefficients in 2Q14 IN : Pointer to new input sample in 1Q15 N : Filter order u : Pointer to state variable vector Output : Return Y : Filter output in 1Q15 format Implementation Description The IIR filter routine is implemented based on direct form II implementation (Nth Order) showed in Figure 4-20. The filter operates on 16-bit real input, 16-bit real coefficients and returns 16-bit real output. The number of inputs is arbitrary. Pseudo code { ; x(n) = input signal at time n ; u(n) = state variable at time n ; y(n) = output signal at time n ; a(k), b(k) = IIR filter coefficients ; N = M refer to the filter order in Equation [4.16] DataS a[N], b[N+1]; DataS u[N+1]; DataS i; //Filter vectors //state variable vector ;Calculate the sate variable at time n, u(n) u(n) = x(n); for(i=1; i0 is a constant called step-size. Note that the filter coefficients are time varying. h(i,n) denotes the value of the i-th coefficient at time n. The algorithm has three stages: 1. calculate the current filter output y(n). 2. calculate the current error value e(n) using the current expected value d(n) and currently calculated output y(n). 3. update the filter coefficients for next iteration using current error e(n) and samples x(n-k). Step-size µ controls the convergence of the filter coefficients to the optimal (or stationary) state. The larger the µ value, the faster the convergence of the adaptation. On the other hand, a large value of µ also leads to a large variation of h(i,n) (a bad accuracy) and thus a large variation of the output error (a large residual error). Therefore, the choice of µ is always a trade-off between fast convergence and high accuracy. µ must not be larger than a certain threshold. Otherwise, the LMS algorithm diverges. 4.5.1 Delayed LMS algorithm for an adaptive filter In the regular LMS adaptive filter the update of filter coefficients makes use of current error value and input value. This makes the choice of step-size µ more difficult due to the effect of µ on convergence of adaptive filter. To minimize the effect of µ on the filter convergence a delayed LMS algorithm for an adaptive filter is introduced. The algorithm of a delayed LMS adaptive filter can be represented by the following mathematical equations. User’s Manual for Keil Compiler -112 V 1.2, 2003-11 DSP Library for XC16x Family N–1 ∑ y(n) = h(k, n) ⋅ x(n – k) k=0 h(k, n + 1) = h(k, n) + x(n – k – 1) × µ × e(n – 1) e(n) = d(n) – y(n) where, [4.32] [4.33] [4.34] y(n) x(n) d(n) h(0,n),h(1,n),.. N µ e(n) : output sample of the filter at index n : input sample of the filter at index n : expected output sample of the filter at index n : filter coefficients at index n : filter order (number of coefficients) : step-size : error value at index n The algorithm has three stages: 1. calculate the current filter output y(n). 2. update filter coefficients for the next iteration using previous error value e(n-1) and the delayed input sample x(n-k-1). 3. calculate the current error value e(n) and store it in memory. 4.5.2 Descriptions In the DSP library, the delayed LMS adaptive filters are implemented. The following are the implemented delayed LMS adaptive FIR filter functions with 16 bit input and 16 bit or 32 bit filter coefficients. • 16 bit real coefficients, delayed LMS, Sample processing • 32 bit real coefficients, delayed LMS, Sample processing User’s Manual for Keil Compiler -113 V 1.2, 2003-11 DSP Library for XC16x Family Adap_filter_16 Signature Inputs Return Implementation Description 16 bit real coefficients, Delayed LMS, Sample Processing DataS Adap_filter_16 ( ) DataS* DataS* DataS* DataS* DataS DataS DataS* h, IN, D, error, N_h, Step, d_buffer h : Pointer to filter coefficients in 1Q15 IN : Pointer to new input value D : Pointer to the expected signal at time n error : Pointer to error signal N_h Filter size Step : Adaptive gain d_buffer Pointer to delay buffer Y : Output value of the filter in 1Q15 Delayed LMS algorithm has been used to realize an adaptive FIR filter. That is, the update of coefficients in the current instant is done using the error in the previous output. The FIR filter is implemented using transversal structure and is realized as a tapped delay line. In this routine, both of signals and filter coefficients have 16 bit precision. This routine processes one sample at a time and returns output of that sample. The input for which the output is to be calculated is sent as an argument to the function. User’s Manual for Keil Compiler -114 V 1.2, 2003-11 DSP Library for XC16x Family Adap_filter_16 16 bit real coefficients, Delayed LMS, Sample Processing (cont’d) Pseudo code { ; x(n) = input signal at time n ; d(n) = desired signal at time n ; y(n) = output signal at time n ; h(k,n) = k-th coefficient at time n ; gain = adaptive gain ; N = number of coefficient taps in the filter DataS DataS DataS DataS DataS x(n); d(n); h(k,n); y(n), Y; k; //input signal //desired signal //filter coefficient //output ;Calculate the output at time n y(n) = 0; for(k=0; k0; i--) temp = temp + (X(k)<>(15-i); //write the output into X X(k) = temp; } n = (DataS)log10(N); Y = n; return Y; //Filter Output returned } Techniques User’s Manual for Keil Compiler -139 V 1.2, 2003-11 DSP Library for XC16x Family Bit_reverse Assumptions Register Usage Memory Note Reverse the binary bit of the input data (cont’d) • 16 bit Input data vector X • N = 2n ( n < 12) • From .c file to .asm file: Defined by the compiler • From .asm file to .c file: (R4) = n (exponent of N) Memory High Addr. b15b14b13 ...... b2b1b0 b15b14b13 ...... b2b1b0 b15b14b13 ...... b2b1b0 . . . Low Addr. b15b14b13 ...... b2b1b0 b15b14b13 ...... b2b1b0 1Q15 Before bitreverse Figure 4-43 Bit_reverse Memory b0b1b2 ...... b13b14b15 X(N-1) b0b1b2 ...... b13b14b15 b0b1b2 ...... b13b14b15 . . . . . . b0b1b2 ...... b13b14b15 b0b1b2 ...... b13b14b15 X(0) 1Q15 After bitreverse Example Cycle Count C166Lib\Keil\Examples\FFT\real_FFT\bit_rev.c Exponent 6 determination Bit reverse: User’s Manual for Keil Compiler -140 V 1.2, 2003-11 Bit_reverse Code Size DSP Library for XC16x Family Reverse the binary bit of the input data (cont’d) if N=21 3 if N=22 , N=23 10 if N=24 , N=25 12 if N=26 , N=27 14 if N=28 , N=29 16 if N=210 , N=211 18 Restore state 3 Return 1 Total Exponent determination Bit reverse: if N=21 if N=22 , N=23 if N=24 , N=25 if N=26 , N=27 if N=28 , N=29 if N=210 , N=211 Restore state Return Total 12 bytes 6 bytes 20 bytes 24 bytes 28 bytes 32 bytes 36 bytes 6 bytes 2 bytes 304 bytes User’s Manual for Keil Compiler -141 V 1.2, 2003-11 DSP Library for XC16x Family FloatTo1Q15 Signature Inputs Output Return Implementation Description Pseudo code Assumption Register Usage Change the floating point format to fixed point format 1Q15 DataS FloatTo1Q15 ( float x) x : Input in float format : : Output in 1Q15 format This routine is a subroutine in the DIT FFT implementation packet and used to change the floating point data into a 1Q15 fixed point format. The floating point format has the structure: 1. word: s e e e e e e e e mmmmmmm 2. word: mmmmmmmmmmmmmmmm where s =sign, e=exponent, m=mantissa. After format change we have the 1Q15 fixed point data with the structure: s. b1 b2 b3 b4 ...... b15 sig. 2-1 2-2 2-3 2-4 ...... 2-15 For example: binary 0111 1111 1111 1111 0110 0000 0000 0000 1010 0000 0000 0000 1000 0000 0000 0000 hex 7FFF 6000 A000 8000 value +1 +0.75 - 0.75 -1 DPRAM begins with the address # f200h (it can be changed) • From .c file to .asm file: Defined by the Compiler • From .asm file to .c file: The output is stored in R4. User’s Manual for Keil Compiler -142 V 1.2, 2003-11 DSP Library for XC16x Family FloatTo1Q15 Memory Note Change the floating point format to fixed point format 1Q15 (cont’d) s e7 ... e0 m22 ... m16 m15 m14 m13 ... m2 m1 m0 Floating point format s b1 b2 ... b14 b15 sign 2-1 2-2 2-14 2-15 Fixed point format Figure 4-44 FloatTo1Q15 Example Cycle Count Code SIze C166Lib\Keil\Examples\Misc\Float_1Q15.c Read exponent Read mantissa Output Return Total Read exponent Read mantissa Output Return Total 10 7 12 1 30 20 bytes 14 bytes 20 bytes 2 bytes 56 bytes User’s Manual for Keil Compiler -143 V 1.2, 2003-11 DSP Library for XC16x Family real_DIT_FFT Signature Inputs Output Return Implementation Description Real Forward Radix-2 Decimation-in-Time Fast FourierTransformation void real_DIT_FFT ( DataS* DataS* DataS DataS* DataS* ) x, index, exp, table, X x : 16 bit real input vector index Bit reversed input index vector exp : Exponent of the input block size table : The precalculated trigonometric function (sinus and cosine) table X : The FFT output vector in 1Q15 format : This function computes the N-point real forward radix-2 decimation-in-time Fast Fourier Transform on the given Npoint real input array. The detailed implementation is given in the Section 4.7.2 . The function is implemented as a complex FFT of size N/2 followed by a unpack stage to unpack the real FFT results. Normally an FFT of a real sequence of size N produces a complex sequence of size N or 2N real numbers that will not fit in the input sequence. To accommodate all the results without requiring extra memory locations, the output reflects only half of the complex spectrum plus the spectrum at the nyquist point (N/2). This still provides the full information because an FFT of a real sequence has even symmetry around the nyquist point. User’s Manual for Keil Compiler -144 V 1.2, 2003-11 DSP Library for XC16x Family real_DIT_FFT Real Forward Radix-2 Decimation-in-Time Fast FourierTransformation (cont’d) Pseudo code { DataS* table; //sinus and cosine table DataS* x; //16 bit real input vector CplxS* X; //FFT output vector CplxS P(n), P(n+1), Q(n), Q(n+1), Y(N/2), H(N/2), G(N/2); DataS k,i,n; //building the complex sequences P(n) and Q(n) Q(n) = x(2n) + jx(2n+1); P(n) = x(2n+N/4) + jx(2n+N/4+1); //Outloop = 1 to exp=log2(N/2) for(k=0; k++; k0.5) x = 1-abs(x); //polynomial y = a(0)*x + a(1); y = y*x + a(2); y = y*x + a(3); y = y*x + a(4); y = y*x; //x in I/II quandrant if(sign == 0) y = y; //x in III/IV quandrant else if(sign == 1) y = -y; } Assumption User’s Manual for Keil Compiler -170 V 1.2, 2003-11 Sine Techniques Register Usage Memory Note Sine function (cont’d) • Use of CoMAC instructions • Instruction re-ordering • Defined by the compiler • From .asm file to.c file y is stored in R4 DSP Library for XC16x Family High Addr. Data Memory a(4) a(3) a(2) a(1) Low Addr. a(0) 4Q12 . . . . 5 ∑ a(i −1)xi i =1 . . x Input (R4)=y 1Q15 Output Figure 4-50 Sine Example C166Lib\Keil\Examples\Math\ Sinus.c Cycle Count Memory 1 Initialization Polynomial 26 User’s Manual for Keil Compiler -171 V 1.2, 2003-11 Sine Code Size DSP Library for XC16x Family Sine function (cont’d) Return 1 Total 28 Memory initialization Polynomial Return Total 2 bytes 88 bytes 2 bytes 92 bytes User’s Manual for Keil Compiler -172 V 1.2, 2003-11 DSP Library for XC16x Family P_series Signature Inputs Output Return Implementation Description N-th order power series DataS P_series( DataS* a, DataS* IN, DataS N) a : Pointer to the coefficient vector in 1Q15 format IN : Input value N : Size of series that must be even number Y : Series output in 1Q15 format The routine is implemented with CoMUL and CoADD instructions. To optimize the routine, the implementation uses "loop unrolling" technique and assumes that N is even. The implementation formula is: N ∑ y = a(i) ⋅ xi i=0 = [[[[a(N) ⋅ x + a(N – 1)]x + a(N – 2)]x + a(N – 3)]x + …] Pseudo code { DataS DataS DataS DataS a[N]; X; Y; i; Y = a(N); for(i=0; i|y|. Pseudo code Assumption Techniques Register Usage xy-- = xy-----××-----22---31--15- × 2–15 |x| should be less than |y| • From .c file to .asm file: Defined by the compiler • Output in R4 User’s Manual for Keil Compiler -179 V 1.2, 2003-11 DSP Library for XC16x Family div_q15 Division of two 1Q15 fractional inputs (cont’d) Memory Note Dividend x15 x14 ... x1 x0 Divisor y15 y14 ... y1 y0 Input DIVL r15 r14 ... r1 r0 R4 Output Figure 4-53 Division of two 1Q15 fractional inputs Example Cycle Count Code Size C166Lib\Keil\Examples\Math\ div_16.c Compute absolute 6 values Compute signs 8 Division 11 Return 1 Total 26 Compute absolute values Compute signs Division Return Total 20 bytes 24 bytes 18 bytes 2 bytes 64 bytes User’s Manual for Keil Compiler -180 V 1.2, 2003-11 DSP Library for XC16x Family div_q31 Division of 1Q31 fractional by 1Q15 fractional Signature Inputs Output Return Implementation Description DataS div_q31( DataL x, DataS y ) x : Dividend in 1Q31 format y : Divisor in 1Q15 format r : Result in 1Q15 format This function performs the division of two fractional inputs. The dividend is in1Q31 format and the divisor is in 1Q15 format.The result is saturated to +1 or -1 if |x|>|y|,which is in 1Q15 format. xy-- = xy-----××-----22---31--15- Pseudo code Assumption Techniques Register Usage • |x|should be less than |y| • Use of CoMUL instructions • From .c file to .asm file: Defined by the compiler • Output in R4 User’s Manual for Keil Compiler -181 V 1.2, 2003-11 DSP Library for XC16x Family div_q31 Division of 1Q31 fractional by 1Q15 fractional (cont’d) Memory Note Dividend x31 x30 ... x17 x16 Dividend x15 x14 ... x1 x0 DIVL Divisor y15 y14 ... y1 y0 Input r15 r14 ... r1 r0 R4 Output Figure 4-54 Division of 1Q31 fractional by 1Q15 fractional Example Cycle Count C166Lib\Keil\Examples\Math\div_32.c Compute absolute 8 values Compute signs 8 Division 11 Return 1 Total 28 Code Size Compute absolute values Compute signs Division 22 bytes 24 bytes 18 bytes User’s Manual for Keil Compiler -182 V 1.2, 2003-11 DSP Library for XC16x Family div_q31 Division of 1Q31 fractional by 1Q15 fractional (cont’d) Return 2 bytes Total 60 bytes User’s Manual for Keil Compiler -183 V 1.2, 2003-11 Sqrt Signature Inputs Output Return Implementation Description DSP Library for XC16x Family Square root DataS Sqrt( DataS x ) x : Input value in 1Q15 format in the range [0,1) y : Output value in 1Q15 format : The square root of the input value x can be calculated by using the following polynomial expansion Sqrt(x) = 1.454895(x) – 1.34491(x)2 + 1.106812(x)3 –0.536499(x)4 + 0.1121216(x)5 + 0.2075806 where,0.5<=x<=1 The coefficents of polynomial are stored in 2Q14 format.The square root table(table of scale factors)stores 1/sqrt(2^n) in 1Q15 format where n ranges from 0 to15. As the polynomial expansion needs input only in the range 0.5 to 1,the given input has to be scaled up.If the input value is less than 0.5,then it is scaled up by the powers of two,so that the scaled input value lies in the range 0.5 to 1.This scaled input is used for polynomial calculation.The calculated output is scaled down by 1/sqrt(2^n) to get the actual output.The obtained result is in 1Q15 format. User’s Manual for Keil Compiler -184 V 1.2, 2003-11 DSP Library for XC16x Family Sqrt Square root (cont’d) Pseudo code { DataS x; //input value DataS C; // Coeffcient DataS y; //output value DataS *scale factor; //scaling if(x>=0.5 && x<1) { y=((((c^5*x+c^4)x+c^3)x+c^2)x+c)x; //compute Square root y=y+0.2075806; //compute Square root } if(x>=0 && x<0.5) { x=x*scale factor; y=((((c^5*x+c^4)x+c^3)x+c^2)x+c)x; //compute Square root y=y+0.2075806; y=y/scale factor; } Assumption • Input is positive Techniques • Use of CoMUL instructions Register Usage • From .c file to .asm file: Defined by the compiler • From .asm file to .c file R4 = y Memory Note Example C166Lib\Keil\Examples\Math\ Squart.c Cycle Count Register 1 Initialization checking if x<=0 3 checking if x>0.5 3N compute square 17 root Multiplication with 5 scalar User’s Manual for Keil Compiler -185 V 1.2, 2003-11 Sqrt Code Size DSP Library for XC16x Family Square root (cont’d) Return 1 Total 27+3N Example: N=4 cycle=39 Register Initialization checking if x<0 checking if x>0.5 compute root square Multiplication with scalar Return 2 bytes 8 bytes 12 bytes 66 bytes 14 bytes 2 bytes Total 104 bytes User’s Manual for Keil Compiler -186 V 1.2, 2003-11 DSP Library for XC16x Family 4.10 Statistical Functions The following statistical functions are described. • Correlation Cross-correlation Autocorrelation • Convolution 4.10.1 Correlation 4.10.1.1 Definitions of Correlation Correlation determines the degree of similarity between two signals. If two signals are identical their correlation coefficient is 1, and if they are completely different it is 0. If they are identical by 180 phase shift between them, then the correlation coefficient is -1. There are two types of correlation, Cross-Correlation and Autocorrelation. When two independent signals are compared, the procedure is cross-correlation. When the same signal is compared to phase shifted copies of itself, the procedure is autocorrelation. Autocorrelation is used to extract the fundamental frequency of a signal. The distance between correlation peaks is the fundamental period of the signal. Suppose that N1 and N2 represent the size of input signals x1 and x2, respectively, and N=N1+N2 and N1 ≥ N2 . Extending x1 to N-point vector through adding N2-points of zero at the beginning, and x2 to N-point vector through adding N1-points of zero in the end, we can define the discrete cross-correlation as follows. Raw Cross-correlation: N1 + j – 1 ∑ r(j) = x1(i)x2(i + j) , –N2 + 1 ≤ j ≤ N1 – 1 i=0 Biased Cross-correlation: ∑ r ( j ) = N---1--1-- × N1 + j – 1 x1(i)x2(i + j) i=0 , –N2 + 1 ≤ j ≤ N1 – 1 [4.62] [4.63] Unbiased Cross-correlation: ∑ r(j) = N------–-----a-1--b---s---(--j--)- × N1 + j – 1x1(i)x2(i + j) i=0 , –N2 + 1 ≤ j ≤ N1 – 1 [4.64] User’s Manual for Keil Compiler -187 V 1.2, 2003-11 DSP Library for XC16x Family The above definitions contain the full-length cross-correlation of the real input signals x1 and x2 with N points output, which consists of N2 points of the negative-side and N1 points of the positive-side. If the input vectors x1 and x2 are same and equal to x with the size of N, we have the following definitions of the positive-side of the autocorrelation. Raw Autocorrelation: N–j–1 ∑ r(j) = x(i)x(i + j) i=0 Biased Autocorrelation: ∑ r(j) = N-1-- × N – j – 1 x(i)x(i + j) i=0 for j = 0 to Nr ≤ N – 1 for j = 0 to Nr ≤ N – 1 [4.65] [4.66] Unbiased Autocorrelation: ∑ r(j) = N-----1-–-----j × N – j – 1 x(i)x(i + j) i=0 for j = 0 to Nr ≤ N – 1 , [4.67] where j is the lag value, as it indicates the shift/lag considered for the r(j) autocorrelation. Nr is the correlation length and it determines how much data is used for each autocorrelation result. Note that the full-length autocorrelation of vector x will have 2N-1 points with even symmetry around the lag 0 point r(0). The above definitions define only the positive half for memory and computational savings. 4.10.1.2 Implementation Note Directly implementing the cross-correlation according to definitions in Equation [4.62] , Equation [4.63] and Equation [4.64] has difficulty due to the negative index. To make the algorithms realizable in assembly we need to rewrite the definitions. Raw Cross-correlation: User’s Manual for Keil Compiler -188 V 1.2, 2003-11 DSP Library for XC16x Family The negative-side can be rewriten as with positive index j ∑ r(j) = x1(i)x2(N2 – j – 1 + i) , i=0 and the positive-side as 0 ≤ j ≤ N2 – 1 N2 – 1 ∑ r(j + N2) = x1(i + j)x2(i) , i=0 0 ≤ j ≤ N1 – N2 – 1 N1 – j – 1 ∑ r(j + N2) = x1(i)x2(N2 – j – 1 + i) , i=0 N1 – N2 < j ≤ N1–1 . [4.68] [4.69] [4.70] Biased Cross-correlation: The negative-side: ∑ r(j) = N---1--1-- × j x1(i)x2(N2 – j – 1 + i) i=0 , 0 ≤ j ≤ N2 – 1 [4.71] The positive-side: ∑ r(j + N2) = N---1--1-- × N2 – 1 x1(i + j)x2(i) i=0 , 0 ≤ j ≤ N1 – N2 – 1 [4.72] ∑ r(j + N2) = N---1--1-- × N1 – j – 1 x1(i)x2(N2 – j – 1 + i) , N1 – N2 < j ≤ N1 – 1 [4.73] i=0 Unbiased Cross-correlation: The negative-side: ∑ r(j) = -a---b---s---(-1-j---+-----1---)- × j x1(i)x2(N2 – j – 1 + i) , i=0 User’s Manual for Keil Compiler -189 0 ≤ j ≤ N2 – 1 [4.74] V 1.2, 2003-11 DSP Library for XC16x Family The positive-side: ∑ r(j + N2) = N---1--2-- × N2 – 1 x1(i + j)x2(i) i=0 , 0 ≤ j ≤ N1 – N2 – 1 [4.75] ∑ r(j + N2) = N-----1-----–----1a---b----s--(---j--) × N1 – j – 1 x1(i)x2(N2 – j – 1 + i) , N1 – N2 < j ≤ N1 – 1 [4.76] i=0 4.10.2 Implementation Description The following routines are described. • Raw autocorrelation • Biased autocorrelation • Unbiased autocorrelation • Raw cross-correlation • Biased cross-correlation • Unbiased cross-correlation User’s Manual for Keil Compiler -190 V 1.2, 2003-11 DSP Library for XC16x Family Auto_raw Raw autocorrelation Signature Inputs Output Return Implementation Description Pseudo code { DataS *x; DataS *y; DataS N_x; DataS N_y; DataS i,j; void Auto_raw( ) DataS* DataS* DataS DataS x, y, N_x, N_y x : Pointer to input vector in 1Q15 format N_x : The length of input vector N_y (Nr) : The length of output vector y : Pointer to output vector in 1Q15 format : The function performs the positive side of the raw autocorrelation function of real vector x according to the definition in Equation [4.65] . The arguments to the function are pointer to the input vector, pointer to output buffer to store autocorrelation result, size of input buffer and number of auto correlated outputs desired. The input values and output values are both in 16 bit fractional format. //Ptr to input vector //Ptr to output vector //size of input //size of autocorrelation result for(j=0; j

Top_arrow
回到顶部
EEWORLD下载中心所有资源均来自网友分享,如有侵权,请发送举报邮件到客服邮箱bbs_service@eeworld.com.cn 或通过站内短信息或QQ:273568022联系管理员 高进,我们会尽快处理。