PrograminWDM_Part43- -
2009-05-17 13:36

http://xiaomaier.bokee.com/3352204.html



String Handling

WDM drivers can work with string data in any of four formats:

  • A Unicode string, normally described by a UNICODE_STRING structure, contains 16-bit characters. Unicode has sufficient code points to accommodate the language scripts used on this planet. A whimsical attempt to standardize code points for the Klingon language, reported in the first edition, has been rejected. A reader of the first edition sent me the following e-mail comment about this:

I suspect this is rude, and possibly obscene.

  • An ANSI string, normally described by an ANSI_STRING structure, contains 8-bit characters. A variant is an OEM_STRING, which also describes a string of 8-bit characters. The difference between the two is that an OEM string has characters whose graphic depends on the current code page, whereas an ANSI string has characters whose graphic is independent of code page. WDM drivers won’t normally deal with OEM strings because they would have to originate in user mode, and some other kernel-mode component will have already translated them into Unicode strings by the time the driver sees them.
  • A null-terminated string of characters. You can express constants using normal C syntax, such as “Hello, world!” Strings employ 8-bit characters of type CHAR, which are assumed to be from the ANSI character set. The characters in string constants originate in whatever editor you used to create your source code. If you use an editor that relies on the then-current code page to display graphics in the editing window, be aware that some characters might have a different meaning when treated as part of the Windows ANSI character set.
  • A null-terminated string of wide characters (type WCHAR). You can express wide string constants using normal C syntax, such as L"Goodbye, cruel world!" Such strings look like Unicode constants, but, being ultimately derived from some text editor or another, actually use only the ASCII and Latin1 code points (0020-007F and 00A0-00FF) that correspond to the Windows ANSI set.

The UNICODE_STRING and ANSI_STRING data structures both have the layout depicted in Figure 3-13. The Buffer field of either structure points to a data area elsewhere in memory that contains the string data. MaximumLength gives the length of the buffer area, and Length provides the (current) length of the string without regard to any null terminator that might be present. Both length fields are in bytes, even for the UNICODE_STRING structure.

Figure 3-13. The UNICODE_STRING and ANSI_STRING structures.

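The layout described above can be sketched in user-mode C. This is only an illustration under stated assumptions: WCHAR16, COUNTED_WSTRING, describe_buffer, and char_count are hypothetical names invented here, and a real driver takes WCHAR and UNICODE_STRING from the DDK headers instead. The point of the sketch is that both length fields count bytes, so the character count has to be derived.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-mode stand-ins for illustration only (hypothetical names). */
typedef uint16_t WCHAR16;            /* 16-bit character, like WCHAR */

typedef struct {
    uint16_t Length;                 /* bytes in use, terminator excluded */
    uint16_t MaximumLength;          /* bytes available at Buffer */
    WCHAR16 *Buffer;                 /* string data lives elsewhere */
} COUNTED_WSTRING;

/* Describe an existing buffer that currently holds cchUsed characters. */
static void describe_buffer(COUNTED_WSTRING *s, WCHAR16 *buf,
                            size_t cchUsed, size_t cchCapacity)
{
    assert(cchUsed <= cchCapacity);
    assert(cchCapacity * sizeof(WCHAR16) <= UINT16_MAX);
    s->Buffer = buf;
    s->Length = (uint16_t)(cchUsed * sizeof(WCHAR16));
    s->MaximumLength = (uint16_t)(cchCapacity * sizeof(WCHAR16));
}

/* The character count is derived; the structure stores only byte counts. */
static size_t char_count(const COUNTED_WSTRING *s)
{
    return s->Length / sizeof(WCHAR16);
}
```

Describing an 8-character buffer that holds the two characters "Hi" yields Length 4 and MaximumLength 16, which is the byte-versus-character distinction the paragraph warns about.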

The kernel defines three categories of functions for working with Unicode and ANSI strings. One category has names beginning with Rtl (for run-time library). Another category includes most of the functions that are in a standard C library for managing null-terminated strings. The third category includes the safe string functions from strsafe.h, which will hopefully be packaged in a DDK header named NtStrsafe.h by the time you read this. I can’t add any value to the DDK documentation by repeating what it says about the RtlXxx functions. I have, however, distilled in Table 3-9 a list of now-deprecated standard C string functions and the recommended alternatives from NtStrsafe.h.

Table 3-9. Safe Functions for String Manipulation

| Standard Function (Deprecated) | Safe UNICODE Alternative | Safe ANSI Alternative |
|---|---|---|
| strcpy, wcscpy, strncpy, wcsncpy | RtlStringCbCopyW, RtlStringCchCopyW | RtlStringCbCopyA, RtlStringCchCopyA |
| strcat, wcscat, strncat, wcsncat | RtlStringCbCatW, RtlStringCchCatW | RtlStringCbCatA, RtlStringCchCatA |
| sprintf, swprintf, _snprintf, _snwprintf | RtlStringCbPrintfW, RtlStringCchPrintfW | RtlStringCbPrintfA, RtlStringCchPrintfA |
| vsprintf, vswprintf, _vsnprintf, _vsnwprintf | RtlStringCbVPrintfW, RtlStringCchVPrintfW | RtlStringCbVPrintfA, RtlStringCchVPrintfA |
| strlen, wcslen | RtlStringCbLengthW, RtlStringCchLengthW | RtlStringCbLengthA, RtlStringCchLengthA |

NOTE

I based the contents of Table 3-9 on a description of how one of the kernel developers planned to craft NtStrsafe.h from an existing user-mode header named strsafe.h. Don’t trust me—trust the contents of the DDK!

It’s also okay, but not idiomatic, to use memcpy, memmove, memcmp, and memset in a driver. Nonetheless, most driver programmers use these RtlXxx functions in preference:

  • RtlCopyMemory or RtlCopyBytes instead of memcpy to copy a “blob” of bytes from one place to another. These functions are actually identical in the current Windows XP DDK. Furthermore, for Intel 32-bit targets, both are macro’ed to memcpy, and memcpy is the subject of a #pragma intrinsic, so the compiler generates inline code to perform it.
  • RtlZeroMemory instead of memset to zero an area of memory. RtlZeroMemory is macro’ed to memset for Intel 32-bit targets, and memset is mentioned in a #pragma intrinsic.
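The macro forwarding described in the two bullets above can be imitated in user mode. The sketch below is an assumption-laden illustration, not the DDK's own definitions: MY_COPY_MEMORY, MY_ZERO_MEMORY, SAMPLE_BLOCK, and fill_block are hypothetical names made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins that forward to the C run-time, in the manner the
   text describes for 32-bit Intel targets. Not the DDK's definitions. */
#define MY_COPY_MEMORY(dst, src, cb)  memcpy((dst), (src), (cb))
#define MY_ZERO_MEMORY(dst, cb)       memset((dst), 0, (cb))

typedef struct {
    unsigned long Signature;
    unsigned char Payload[16];
} SAMPLE_BLOCK;                        /* hypothetical structure */

/* Clear the whole block, then copy a "blob" of bytes into its payload. */
static void fill_block(SAMPLE_BLOCK *block, const void *data, size_t cb)
{
    assert(cb <= sizeof(block->Payload));
    MY_ZERO_MEMORY(block, sizeof(*block));
    MY_COPY_MEMORY(block->Payload, data, cb);
}
```

Because the macros expand straight to memcpy and memset, a compiler that treats those as intrinsics can still emit inline code for them.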

You should use the safe string functions in preference to standard run-time routines such as strcpy and the like. As I mentioned at the outset of this chapter, the standard string functions are available, but they’re often too hard to use safely. Consider these points in choosing which string functions you’ll use in your driver:

  • The uncounted forms strcpy, strcat, sprintf, and vsprintf (and their Unicode equivalents) don’t protect you against overrunning the target buffer. Neither does strncat (and its Unicode equivalent), wherein the length argument applies to the source string.
  • The strncpy and wcsncpy functions will fail to append a null terminator to the target if the source is at least as long as the specified length. In addition, these functions have the possibly expensive feature of filling any leftover portion of the target buffer with nulls.
  • Any of the deprecated functions has the potential to walk off the end of a memory page looking in vain for a null terminator. This trait makes them especially dangerous when dealing with string data coming to you from user mode.
  • As I write this, NtStrsafe.h doesn’t currently define any comparison functions (strcmp, etc.). Keep your eye on the DDK for such functions. Note that case-insensitive comparisons of ANSI strings are tricky because they depend on localization settings that can vary from one session to another on the same computer.
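The strncpy hazard from the list above is easy to reproduce in user mode. The sketch below is a minimal illustration under stated assumptions: strncpy_terminates and bounded_copy are hypothetical helpers written for this example, and bounded_copy only imitates the general idea of a size-checked copy. It is not one of the NtStrsafe.h functions, so consult the DDK for their real behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if strncpy left a terminator inside dst, 0 if it did not. */
static int strncpy_terminates(char *dst, size_t cbDest, const char *src)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, cbDest);      /* pads with nulls, or omits the null */
    return memchr(dst, '\0', cbDest) != NULL;
}

/* Hypothetical helper: copy src into a dst of cbDest bytes and always
   leave a terminator. Returns 0 on success, -1 if the source had to be
   truncated or the arguments are unusable. */
static int bounded_copy(char *dst, size_t cbDest, const char *src)
{
    size_t i;

    if (dst == NULL || src == NULL || cbDest == 0)
        return -1;

    /* Stop at the terminator, or when only the terminator's slot is left. */
    for (i = 0; i + 1 < cbDest && src[i] != '\0'; ++i)
        dst[i] = src[i];
    dst[i] = '\0';                  /* no padding of the leftover bytes */

    return src[i] == '\0' ? 0 : -1;
}
```

With a 4-byte target and the source "abcd", strncpy fills all four bytes and leaves no terminator, whereas the bounded helper stores "abc" and reports the truncation to its caller.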

Allocating and Releasing String Buffers

You often define UNICODE_STRING (or ANSI_STRING) structures as automatic variables or as parts of your own device extension. The string buffers to which these structures point usually occupy dynamically allocated memory, but you’ll sometimes want to work with string constants too. Keeping track of who owns the memory to which a particular UNICODE_STRING or ANSI_STRING structure points can be a bit of a problem. Consider the following fragment of a function:

UNICODE_STRING foo;
if (bArriving)
   RtlInitUnicodeString(&foo, L"Hello, world!");     // foo.Buffer points at a constant
else
   {
   ANSI_STRING bar;
   RtlInitAnsiString(&bar, "Goodbye, cruel world!");
   RtlAnsiStringToUnicodeString(&foo, &bar, TRUE);   // TRUE: system allocates foo.Buffer
   }
RtlFreeUnicodeString(&foo); // <== don't do this!

In one case, we initialize foo.Length, foo.MaximumLength, and foo.Buffer to describe a wide character string constant in our driver. In another case, we ask the system (by means of the TRUE third argument to RtlAnsiStringToUnicodeString) to allocate memory for the Unicode translation of an ANSI string. In the first case, it’s a mistake to call RtlFreeUnicodeString because it will unconditionally try to release a memory block that’s part of our code or data. In the second case, it’s mandatory to call RtlFreeUnicodeString eventually if we want to avoid a memory leak.

The moral of the preceding example is that you have to know where the memory comes from in any UNICODE_STRING structures you use so that you can release the memory only when necessary.

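One way to act on that moral is to record ownership next to the string itself. The following is a user-mode sketch under stated assumptions, not something the DDK provides: OWNED_STRING, init_from_constant, init_from_copy, and release_string are hypothetical names, and malloc/free stand in for whatever allocator a driver would actually use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: a counted string plus a note about who owns it. */
typedef struct {
    unsigned short Length;          /* bytes, terminator excluded */
    unsigned short MaximumLength;   /* bytes available at Buffer */
    char *Buffer;
    int OwnsBuffer;                 /* nonzero only if Buffer came from malloc */
} OWNED_STRING;

/* Point at a constant; there is nothing to free later. */
static void init_from_constant(OWNED_STRING *s, const char *text)
{
    size_t cb = strlen(text);
    assert(cb < 0xFFFF);
    s->Length = (unsigned short)cb;
    s->MaximumLength = (unsigned short)(cb + 1);
    s->Buffer = (char *)text;       /* never written through */
    s->OwnsBuffer = 0;
}

/* Duplicate into heap memory; the caller must release it eventually. */
static int init_from_copy(OWNED_STRING *s, const char *text)
{
    size_t cb = strlen(text);
    char *p;
    assert(cb < 0xFFFF);
    p = malloc(cb + 1);
    if (p == NULL)
        return -1;
    memcpy(p, text, cb + 1);
    s->Length = (unsigned short)cb;
    s->MaximumLength = (unsigned short)(cb + 1);
    s->Buffer = p;
    s->OwnsBuffer = 1;
    return 0;
}

/* Safe on either kind of string, and safe to call twice. */
static void release_string(OWNED_STRING *s)
{
    if (s->OwnsBuffer)
        free(s->Buffer);
    s->Buffer = NULL;
    s->Length = s->MaximumLength = 0;
    s->OwnsBuffer = 0;
}
```

With the flag in place, the cleanup path can call release_string unconditionally, which removes the need to remember which branch filled in the structure.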

