hiphoper 2004-7-23 12:38
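A brief analysis of NNOS (a POSIX-compliant, Linux-like operating system)
I had never studied Linux before. My company uses a POSIX-compliant, Linux-like operating system, and at my manager's request I wrote a brief analysis report on it. I know very little about Linux and did not consult any books (I have not found a good one yet), so some of the terminology may be imprecise; corrections are welcome. This is the first part, and I will finish the remaining parts over the next few weeks.
[b]1. Registering a device driver[/b]
(1) Fill in the file_operations structure
A driver provides five basic operations: open, close, read, write, and ioctl (I/O control). The file_operations structure has five corresponding members, open, close, read, write, and ioctl, each of which must be set to the entry address of the driver function that implements it. Here is an example from an Ethernet driver.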
[QUOTE]fopEtherDrvOps.open = EtherDrvOpen;
fopEtherDrvOps.close = EtherDrvClose;
fopEtherDrvOps.read = EtherDrvRead;
fopEtherDrvOps.write = EtherDrvWrite;
fopEtherDrvOps.ioctl = EtherDrvIoctl;[/QUOTE]
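To make the idea of an operations table concrete, here is a minimal sketch of such a structure being filled with function pointers. All names prefixed with demo_ or Demo are hypothetical, and apart from the (inode, file) argument order of open(), which appears in the NNOS source quoted in this thread, the member signatures are assumptions rather than the real NNOS definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the NNOS types. */
struct demo_inode { unsigned rdev; };
struct demo_file  { struct demo_inode *f_inode; };

/* One function pointer per basic driver operation (signatures assumed). */
struct demo_file_operations {
    long (*open)(struct demo_inode *node, struct demo_file *fp);
    long (*close)(struct demo_inode *node, struct demo_file *fp);
    long (*read)(struct demo_file *fp, void *buf, size_t len);
    long (*write)(struct demo_file *fp, const void *buf, size_t len);
    long (*ioctl)(struct demo_file *fp, long request, void *arg);
};

/* Placeholder driver entry points that only report success. */
static long DemoOpen(struct demo_inode *node, struct demo_file *fp)
{ (void)node; (void)fp; return 0; }
static long DemoClose(struct demo_inode *node, struct demo_file *fp)
{ (void)node; (void)fp; return 0; }
static long DemoRead(struct demo_file *fp, void *buf, size_t len)
{ (void)fp; (void)buf; return (long)len; }
static long DemoWrite(struct demo_file *fp, const void *buf, size_t len)
{ (void)fp; (void)buf; return (long)len; }
static long DemoIoctl(struct demo_file *fp, long request, void *arg)
{ (void)fp; (void)request; (void)arg; return 0; }

static struct demo_file_operations fopDemoOps;

/* Store the entry address of each driver function in the table. */
static void DemoFillOps(void)
{
    fopDemoOps.open  = DemoOpen;
    fopDemoOps.close = DemoClose;
    fopDemoOps.read  = DemoRead;
    fopDemoOps.write = DemoWrite;
    fopDemoOps.ioctl = DemoIoctl;
}
```

Once the table is filled, a caller can invoke fopDemoOps.read(...) without knowing which driver is behind it.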
(2) Register the driver in the device driver list
Using the filled-in file_operations structure, register a new entry in the device information list by calling register_chrdev(), whose prototype is:
[QUOTE]LONG register_chrdev(
DWORD major, // major device number
const CHAR *name, // device name string
struct file_operations *fops // pointer to file_operation struct
);[/QUOTE]
Continuing with the Ethernet driver, the corresponding code is:
[QUOTE]if (ENOERR != register_chrdev(ETHERDEV_MAJOR,
"EtherDrv",
&fopEtherDrvOps)) {
errno = EUKDEV;
return FAIL;
}[/QUOTE]
ETHERDEV_MAJOR is the major device number. The operating system uses it to find the device driver for a given device (an INODE).
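The lookup by major number can be sketched as a small table. This is only an illustration: NNOS keeps its drivers in a cursor queue (qDevice), whereas the sketch uses a fixed array, and the demo_ names, the table size, and the rule that a duplicate major number is rejected are all assumptions.

```c
#include <stddef.h>

#define DEMO_MAX_DRIVERS 8            /* hypothetical table size */

/* Minimal stand-in for the operations table. */
struct demo_fops { long (*open)(void); };

struct demo_dev {
    unsigned major;                   /* major device number */
    const char *name;                 /* device name string */
    struct demo_fops *fops;           /* driver operations */
};

static struct demo_dev demo_table[DEMO_MAX_DRIVERS];
static size_t demo_count;

/* Returns 0 on success, -1 if the major number is taken or the table is full. */
static long demo_register_chrdev(unsigned major, const char *name,
                                 struct demo_fops *fops)
{
    size_t i;
    for (i = 0; i < demo_count; i++)
        if (demo_table[i].major == major)
            return -1;
    if (demo_count == DEMO_MAX_DRIVERS)
        return -1;
    demo_table[demo_count].major = major;
    demo_table[demo_count].name = name;
    demo_table[demo_count].fops = fops;
    demo_count++;
    return 0;
}

/* Find the operations table registered for a major number, or NULL. */
static struct demo_fops *demo_find_fops(unsigned major)
{
    size_t i;
    for (i = 0; i < demo_count; i++)
        if (demo_table[i].major == major)
            return demo_table[i].fops;
    return NULL;
}
```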
(3) Add a device node
A device node can be added either before or after the driver is registered. It is added by calling mknod(), whose prototype is:
[QUOTE]LONG mknod(
const CHAR *pathname, //device path name
mode_t mode, //node mode
dev_t dev //device number
);[/QUOTE]
The Ethernet driver registers a new node with the code below. MKDEV is a macro that combines the major device number and the minor device number into a single unique device number.
[QUOTE]mknod("/dev/ether/uplink", 0, MKDEV(ETHERDEV_MAJOR, i));[/QUOTE]
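For readers unfamiliar with this kind of macro, here is a sketch of how a major and minor number can be packed into one value and taken apart again. The 8-bit/8-bit split and the DEMO_ names are assumptions made for illustration; the actual NNOS layout of dev_t is not shown in this post.

```c
/* Hypothetical 16-bit device number: high byte = major, low byte = minor. */
typedef unsigned short demo_dev_t;

#define DEMO_MKDEV(ma, mi) \
    ((demo_dev_t)((((ma) & 0xffu) << 8) | ((mi) & 0xffu)))
#define DEMO_MAJOR(dev) (((dev) >> 8) & 0xffu)   /* recover the major number */
#define DEMO_MINOR(dev) ((dev) & 0xffu)          /* recover the minor number */
```

With this layout, DEMO_MKDEV(12, 3) yields 0x0c03, and DEMO_MAJOR and DEMO_MINOR return 12 and 3 respectively.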
[b]2. Implementation of the POSIX I/O functions in NNOS[/b]
2.1 High-level I/O functions
The POSIX I/O functions are open(), close(), read(), write(), and ioctl(). Their prototypes are:
[QUOTE]open(const char *_path, int nFlag, ...);
int close(int fd);
read(int fd, void *buf, unsigned len);
write(int fd, const void *buf, unsigned len);
int ioctl(int fd, int request, ...);[/QUOTE]
open(), close(), and ioctl() are thin wrappers around DeviceOpen(), DeviceClose(), and DeviceIoctl(). read() and write() both call readwritev(), which wraps DeviceRead() and DeviceWrite() so that reading and writing go through a single function.
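The layering of read() and write() over one shared function might look roughly like the sketch below. The real signature of readwritev() is not given in this post, so the direction flag, the demo_ names, and the in-memory "device" standing in for DeviceRead()/DeviceWrite() are all assumptions.

```c
#include <stddef.h>
#include <string.h>

enum demo_dir { DEMO_READ, DEMO_WRITE };

/* Stand-ins for DeviceRead()/DeviceWrite(): a tiny in-memory "device". */
static char demo_store[16];

static long DemoDeviceRead(long handle, void *buf, size_t len)
{
    (void)handle;
    if (len > sizeof demo_store)
        len = sizeof demo_store;
    memcpy(buf, demo_store, len);
    return (long)len;
}

static long DemoDeviceWrite(long handle, const void *buf, size_t len)
{
    (void)handle;
    if (len > sizeof demo_store)
        len = sizeof demo_store;
    memcpy(demo_store, buf, len);
    return (long)len;
}

/* Single entry point shared by read and write; dir selects the operation. */
static long demo_readwritev(enum demo_dir dir, int fd, void *buf, unsigned len)
{
    if (buf == NULL)
        return -1;
    return dir == DEMO_READ ? DemoDeviceRead(fd, buf, len)
                            : DemoDeviceWrite(fd, buf, len);
}

static long demo_read(int fd, void *buf, unsigned len)
{
    return demo_readwritev(DEMO_READ, fd, buf, len);
}

static long demo_write(int fd, const void *buf, unsigned len)
{
    /* The cast drops const only to share one signature; the write path
     * never modifies the buffer. */
    return demo_readwritev(DEMO_WRITE, fd, (void *)buf, len);
}
```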
2.2 Device I/O functions
The prototypes of DeviceOpen(), DeviceClose(), DeviceRead(), DeviceWrite(), and DeviceIoctl() are:
[QUOTE]LONG DeviceOpen(const OCTET * pDevName, WORD wFlags);
LONG DeviceClose(LONG dwDevHandle);
LONG DeviceRead(LONG dwDevHandle, void *pBuf, size_t dwLen);
LONG DeviceWrite(LONG dwDevHandle, void *pBuf, size_t dwLen);
LONG DeviceIoctl(LONG dwDevHandle, LONG lRequest, va_list va_args);
[/QUOTE]
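2.2.1 DeviceOpen()
DeviceOpen() opens the specified device by calling DeviceOpenInternal(), which does the following:
(1) It calls stat() to obtain the INODE structure of the device to be opened; the INODE structure holds information such as the device's major device number.
(2) If stat() succeeds, it allocates memory for the file structure (stored in pfFile), the file_operations structure, the INODE structure, and the device path string, then sets pfFile->f_op, pfFile->f_inode, and pfFile->f_inode->pucName to point at the corresponding offsets within the allocated memory.
(3) It searches the device list for the driver matching the major device number. If one is found, it copies the driver's file_operations structure into the memory that pfFile->f_op points to, so that reads, writes, and other operations on the device can be invoked through pfFile->f_op.
(4) It calls the driver's open function through pfFile->f_op->open; if that fails, it frees the file structure's memory.
(5) It computes the device handle in the defined format; this handle is the function's return value.
(6) It adds a new device record (by storing pfFile in the DevRec array).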
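Step (2), one allocation carved into several objects, can be sketched as follows. The demo_ structures are simplified, hypothetical stand-ins for the NNOS file, file_operations, and INODE types, and the sketch keeps the offsets in size_t.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the NNOS structures. */
struct demo_fops  { long (*open)(void); };
struct demo_inode { unsigned rdev; char *name; };
struct demo_file  { struct demo_fops *f_op; struct demo_inode *f_inode; };

/* Allocate file + fops + inode + path string as one block and point the
 * members at their offsets, so a single free() releases everything.
 * This relies on each structure size keeping the next one suitably aligned,
 * which holds for these pointer-sized members on common platforms. */
static struct demo_file *demo_alloc_file(const char *path)
{
    size_t off_op   = sizeof(struct demo_file);
    size_t off_node = off_op + sizeof(struct demo_fops);
    size_t off_name = off_node + sizeof(struct demo_inode);
    struct demo_file *fp = malloc(off_name + strlen(path) + 1);

    if (fp == NULL)
        return NULL;
    fp->f_op = (struct demo_fops *)((char *)fp + off_op);
    fp->f_inode = (struct demo_inode *)((char *)fp + off_node);
    fp->f_inode->name = (char *)fp + off_name;
    strcpy(fp->f_inode->name, path);   /* block was sized for path + NUL */
    return fp;
}
```

The design choice here is that error paths stay simple: releasing the file structure also releases the operations copy, the inode copy, and the name.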
Here is the NNOS code for DeviceOpen():
[QUOTE]LONG
DeviceOpen(const OCTET * pDevName, WORD wFlags)
{
LONG lReturn;
BOOL bExit=FALSE;
OCTET* poName=(OCTET*)pDevName;
char chTempName[50];
// reset the path search list
DLLIST_head(&gdllPathList);
do {
if((strlen(pDevName) == 0)){
break;
}
lReturn=DeviceOpenInternal(poName,wFlags);
if (lReturn>=0) {
// success
CHECKPOINT(__LINE__);
return lReturn;
}
bExit=TRUE;
// if filename doesn't start with /
if (pDevName[0]!='/') {
// if path list exists
char* pchPath;
pchPath=DLLIST_read(&gdllPathList);
CHECKPOINT(__LINE__);
if (pchPath!=NULL) {
CHECKPOINT(__LINE__);
DLLIST_next(&gdllPathList);
// add path to the filename
strcpy(chTempName,pchPath);
strcat(chTempName,pDevName);
poName=chTempName;
// repeat loop to try open with new name
bExit=FALSE;
}
}
} while(!bExit);
CHECKPOINT(__LINE__);
return -ENOENT;
}[/QUOTE]
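One thing worth noting in the code above is that the search path and the device name are joined with strcpy() and strcat() into a 50-byte buffer without a length check. A bounded variant could look like the sketch below; demo_join_path is a hypothetical helper, not part of NNOS.

```c
#include <stdio.h>
#include <string.h>

/* Write prefix followed by name into out.
 * Returns 0 on success, -1 if the result (plus NUL) would not fit. */
static int demo_join_path(char *out, size_t out_len,
                          const char *prefix, const char *name)
{
    int n = snprintf(out, out_len, "%s%s", prefix, name);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

A caller would treat a -1 result as "name too long" and skip that search path entry instead of trying to open a truncated name.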
Here is the code for DeviceOpenInternal():
[QUOTE]static LONG
DeviceOpenInternal(const OCTET * pDevName, WORD wFlags)
{
struct file *pfFile;
OCTET* poName=(OCTET*)pDevName;
INODE inTmp;
LONG lRtnVal;
if ((lRtnVal = stat(poName,&inTmp))) {
// No match found
CHECKPOINT(__LINE__);
return -ENOENT;
} else {
OCTET oOffsetOP = sizeof(struct file);
OCTET oOffsetNode = oOffsetOP + sizeof(struct file_operations);
OCTET oOffsetNodeName = oOffsetNode + sizeof(INODE);
pfFile = (struct file *) malloc(oOffsetNodeName + strlen(poName) + 1);
if (pfFile == NULL) {
CHECKPOINT(__LINE__);
return -ENOMEM;
} else {
pfFile->f_op = (struct file_operations*)((char*)pfFile + oOffsetOP);
pfFile->f_inode = (INODE*)((char*)pfFile + oOffsetNode);
bcopy(&inTmp,pfFile->f_inode,sizeof(INODE));
pfFile->f_inode->i_flags = wFlags;
pfFile->f_inode->pucName = (char*)((char*)pfFile + oOffsetNodeName);
strcpy(pfFile->f_inode->pucName,poName);
}
}
CHECKPOINT(__LINE__);
{
DEV_STRUCT *pdsDev;
pthread_mutex_lock(&mDeviceMutex);
/* find existing device with major number*/
ResetCursorQueue(qDevice);
while ((pdsDev = PeekCursorQueue(qDevice))) {
if (pdsDev->wMajor == MAJOR(pfFile->f_inode->i_rdev)) {
break;
} else {
NextCursorQueue(qDevice);
}
}
pthread_mutex_unlock(&mDeviceMutex);
if (NULL == pdsDev) {
free(pfFile);
return -EINVAL;
} else {
bcopy(pdsDev->pFops,pfFile->f_op,sizeof(struct file_operations));
}
}
{
LONG i;
LONG dwTmp;
dwTmp = pfFile->f_op->open(pfFile->f_inode,pfFile);
if (dwTmp < 0) {
CHECKPOINT(__LINE__);
free(pfFile);
return dwTmp;
}
/* New format of the return value: */
/* */
/* [7 - 0] Fd in Dev_Rec[] */
/* [15 - 8] Device Major */
/* [30 - 16] Return value from f_op->open() */
/* - negative number is error */
/* - host returns Stream Fd */
/* - other devices return ENOERR */
lRtnVal = (dwTmp << 16) & 0x7fff0000;
lRtnVal |= (((MAJOR(pfFile->f_inode->i_rdev)) << 8) & 0xff00);
pthread_mutex_lock(&mDeviceMutex);
for(i=0; i < OPEN_MAX; i++) {
CHECKPOINT(__LINE__);
if (!DevRec[ i ].dwBusy) {
lRtnVal |= (i & 0xff);
pfFile->f_desc = lRtnVal;
DevRec[ i ].dwBusy = 0x1;
DevRec[ i ].pfFile = pfFile;
CHECKPOINT(__LINE__);
pthread_mutex_unlock(&mDeviceMutex);
return (lRtnVal);
}
}
pthread_mutex_unlock(&mDeviceMutex);
CHECKPOINT(__LINE__);
free(pfFile);
CHECKPOINT(__LINE__);
}
CHECKPOINT(__LINE__);
ASSERT(0); // not enough device records !
return -ENOMEM;
}[/QUOTE]
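The handle format described in the comment inside DeviceOpenInternal() (bits 7 to 0 for the DevRec index, bits 15 to 8 for the major number, bits 30 to 16 for the value returned by the driver's open) can be packed and unpacked as in this sketch. The function names are hypothetical; only the bit layout comes from the code above.

```c
/* Pack the three fields into one non-negative handle. */
static long demo_pack_handle(long open_ret, unsigned major, unsigned slot)
{
    long h = (open_ret << 16) & 0x7fff0000L;   /* bits 30..16 */
    h |= ((long)major << 8) & 0xff00L;         /* bits 15..8  */
    h |= (long)(slot & 0xffu);                 /* bits 7..0   */
    return h;
}

/* Index into the device record array. */
static unsigned demo_handle_slot(long h)  { return (unsigned)(h & 0xff); }

/* Major device number. */
static unsigned demo_handle_major(long h) { return (unsigned)((h >> 8) & 0xff); }

/* Value returned by the driver's open function. */
static long demo_handle_open_ret(long h)  { return (h >> 16) & 0x7fff; }
```

For example, an open return value of 3, major number 12, and record index 5 pack into 0x30c05. Because bit 31 is never set, a valid handle is always non-negative, which leaves negative values free for error codes.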
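2.2.2 DeviceClose()
DeviceClose() closes an open device. It wraps the close() function of the underlying driver. Specifically, it:
(1) computes the device record index from the device handle (by ANDing the handle with DEVICE_NMBR_MASK);
(2) calls the driver's close() function to close the device;
(3) removes the device from the device records (by setting dwBusy to 0 in the DevRec array element selected by the record index).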
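The three steps can be sketched as below. The actual DeviceClose() source is not shown in this post, so this is a guess at its shape: the mask value 0xff is assumed from the `i & 0xff` used when the handle is built, and the demo_ types and the bounds check are illustrative additions.

```c
#include <stddef.h>

#define DEMO_OPEN_MAX 4
#define DEMO_DEVICE_NMBR_MASK 0xffL   /* assumed: low 8 bits hold the record index */

/* Hypothetical open-file object with just a close hook. */
struct demo_file {
    long (*close)(struct demo_file *fp);
    int closed;
};

struct demo_rec { int busy; struct demo_file *file; };
static struct demo_rec demo_recs[DEMO_OPEN_MAX];

/* Placeholder driver close function. */
static long demo_close_stub(struct demo_file *fp)
{
    fp->closed = 1;
    return 0;
}

static long DemoDeviceClose(long handle)
{
    long idx = handle & DEMO_DEVICE_NMBR_MASK;       /* (1) record index */
    long ret;

    if (idx >= DEMO_OPEN_MAX || !demo_recs[idx].busy)
        return -1;
    ret = demo_recs[idx].file->close(demo_recs[idx].file);  /* (2) driver close */
    demo_recs[idx].busy = 0;                         /* (3) free the record */
    demo_recs[idx].file = NULL;
    return ret;
}
```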
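2.2.3 DeviceRead(), DeviceWrite(), and DeviceIoctl()
These three functions read from, write to, and control a device. Their implementations are essentially the same: each wraps the corresponding I/O function of the underlying driver (read(), write(), or ioctl()). They do the following:
(1) compute the device record index from the device handle (by ANDing the handle with DEVICE_NMBR_MASK);
(2) call the driver's I/O function and return its return value as their own.
[[i] Last edited by .crystal. on 2007-4-23 17:37 [/i]]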
hiphoper 2004-11-9 11:25
Re: A brief analysis of NNOS (a POSIX-compliant, Linux-like operating system)
Continuing my analysis ^_^
Device drivers in NNOS do not handle interrupts directly, as they do in Linux. Instead, the OS kernel does the job. The kernel keeps a table that holds each device driver's interrupt handler together with an interrupt mask, which the OS uses to decide which handler should service an interrupt. A device driver registers its interrupt handler with the RegisterIsr() function provided by the OS. When an interrupt occurs, the 1st level interrupt handler takes it and looks it up in the table. If a matching entry is found, it raises a software interrupt, which in practice is simply a call to the device driver's interrupt handler.
The ISR installation procedure of NNOS (InstallIsr() in sw\src\os\nnos\hardware\src\Init.c)
[QUOTE]/*************************************************************************
* Function Name: InstallIsr
* Description: install and enable the interrupt handler and install
* the first level interrupt handlers.
* Input: None
* Output: None
* Returns: None
*
*************************************************************************/
/* This also clears out the IV Tbl. WARNING: this clears riface_irqmask */
static void InstallIsr(void)
{
/* install the main ISR and enable interrupts */
DWORD *pdwTgt;
DWORD *pdwSrc;
IV_ENTRY *pivEntry;
register int i; /* MUST be a register for interrupt enabling */
DWORD dwIrqMask = 0;
#ifndef NDEBUG
/* check that low memory stuff does not impinge on the data segment */
extern WORD _dstart;
if ((OCTET *) &_dstart < (OCTET *) pdwLowMemEnd)
INIT_FAIL;
#endif
/* clear low memory, from addr 0 thru the trap handlers */
/* THIS OVERWRITES PART OF MEMORY!!!!!!!!!!! */
pdwTgt = (DWORD *) 0;
for (i = 0; i < (LOW_MEM_CLEAR_AND_TRAP_AREA_LEN)/sizeof(DWORD); i++)
*pdwTgt++ = 0x50000001; /* "ld 1[r0],r0" force alignment exception */
#ifndef HARDWARE_NOKNL
/* copy the "jump-to-kernel-trap-handler" code to low memory
this is the trap jump for NNOS */
/* THIS OVERWRITES PART OF MEMORY!!!!!!!!!!! */
#ifndef NDEBUG
if (LOW_MEM_KERNEL_TRAP_LEN <
4 * ((OCTET *) JumpToKernelTrap_End - (OCTET *) JumpToKernelTrap))
INIT_FAIL;
#endif
pdwTgt = pdwLowMemKernelTrap;
/* convert text to data addr */
pdwSrc = (DWORD *) ((DWORD)JumpToKernelTrap << 2);
for (i = 0;
i < ((OCTET *) JumpToKernelTrap_End - (OCTET *) JumpToKernelTrap); i++)
*pdwTgt++ = *pdwSrc++;
#endif
/* copy the "jump-to-gdb-trap-handler" code to low memory
this is the trap jump for NNOS */
/* THIS OVERWRITES PART OF MEMORY!!!!!!!!!!! */
#ifndef NDEBUG
if (LOW_MEM_GDB_TRAP_LEN <
4 * ((OCTET *) JumpToGdbTrap_End - (OCTET *) JumpToGdbTrap))
INIT_FAIL;
#endif
pdwTgt = (DWORD *) pdwLowMemGdbTrap;
/* convert text to data addr */
pdwSrc = (DWORD *) ((DWORD) JumpToGdbTrap << 2);
for (i = 0;
i < ((OCTET *) JumpToGdbTrap_End - (OCTET *) JumpToGdbTrap); i++)
*pdwTgt++ = *pdwSrc++;
/* clear the IV Tbl */
/* THIS OVERWRITES PART OF LOW MEMORY!!!! */
*pdwLowMemIsrCnt = 0;
pivEntry = pivLowMemIvTbl;
for (i = 0; i < NUM_IV_ENTRIES; i++) {
pivEntry->pIsr = NoVector;
pivEntry++;
}
/* clear second level Dma and Host interrupts */
bzero(pHostIvTbl, NUM_LEVEL2_IV_ENTRIES * sizeof(LEVEL2_IV_ENTRY));
bzero(pDmaIvTbl, NUM_LEVEL2_IV_ENTRIES * sizeof(LEVEL2_IV_ENTRY));
/* gdb handles hardware breakpoints and alignment excps */
#ifndef HARDWARE_NOKNL
/* register the first level interrupt handlers and enable them */
RegisterIsr(RI_IRQ_HOST_RX_MISCDMA_STATUS | RI_IRQ_HOST_TX_MISCDMA_STATUS, miscDmaIsr);
PUT_KODIAC_REG(misc_dma_donemask, 0);
dwIrqMask |= RI_IRQ_HOST_TX_MISCDMA_STATUS | RI_IRQ_HOST_RX_MISCDMA_STATUS | RI_IRQ_HOST_STATUS;
#endif
/* JAE ri_irq_base defaults to 0x4000 so reset it ... */
PUT_KODIAC_REG(ri_irq_mask, dwIrqMask);
PUT_KODIAC_REG(ri_irq_base, ((DWORD)JumpToIsr << 2));
/* enable the interrupts */
i = PSW_ENABLE_IRQ; /* enable interrupts */
asm volatile("movtos %0,psw" : :"r" (i) ); /* in the PSW */
}[/QUOTE]
1. Clear low memory, from address 0 through the trap handlers, after making sure it doesn't overlap the data segment (lines 190-201)
2. Copy the trap-handler code to low memory (lines 203-233)
3. Clear the 1st and 2nd level interrupt handler tables (lines 235-246)
4. Register and enable the 1st level interrupt handler by calling RegisterIsr(); miscDmaIsr() is the 1st level interrupt handler (lines 250-262)
The ISR registration procedure of NNOS (RegisterIsr() of sw\src\os\nnos\isr\src\Lvl1isr.c)
[QUOTE]/*
* register an ISR with a particular riface_irqmask value
*
* parms:
* dwMask mask to identify the IRQ from riface_irqstatus
* pIsr ISR routine address
*
* returns: OK success
* ISR_TBL_FULL IV Tbl Full
* ISR_CONFLICT mask conflicts with an already registered ISR
*/
int
RegisterIsr(DWORD dwMask, void (*pIsr)(IRQ_REGS *))
{
#ifndef _AUDACITYT2
register IV_ENTRY *pivEntry;
#endif
#ifdef _AUDACITYT2
{
DWORD i;
for (i = 0; i < RI_IRQ_MASK_NBITS; i++) {
if (dwMask & (1 << i)) {
if (pivLowMemIvTbl[i ].pIsr != NoVector)
return ISR_CONFLICT;
else
pivLowMemIvTbl[i ].pIsr = pIsr;
}
}
}
return(OK);
#else
/* check for mask conflicts */
for (pivEntry = pivTbl; pivEntry < pivTblEnd; pivEntry++)
if (pivEntry->dwMask & dwMask)
return ISR_CONFLICT;
for (pivEntry = pivTbl; pivEntry < pivTblEnd; pivEntry++)
if (IV_AVAIL == pivEntry->dwMask) {
pivEntry->dwMask = dwMask;
pivEntry->pIsr = pIsr;
return OK;
}
return ISR_TBL_FULL; /* none avail!! */
#endif
}[/QUOTE]
1. Check for conflicts with already registered ISRs and interrupt masks (lines 46-65)
2. Add the interrupt mask and the corresponding ISR entry address to the interrupt handler table
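To see the conflict check in action, here is a small self-contained sketch modeled on the non-_AUDACITYT2 branch above. The table size, the numeric return codes, and the use of 0 for a free entry are assumptions, since IV_AVAIL, OK, ISR_CONFLICT, and ISR_TBL_FULL are not defined in the quoted code.

```c
#define DEMO_IV_ENTRIES 4
#define DEMO_IV_AVAIL   0ul    /* assumed marker for a free entry */

enum { DEMO_OK = 0, DEMO_TBL_FULL = -1, DEMO_CONFLICT = -2 };

typedef void (*demo_isr_t)(void);

struct demo_iv { unsigned long mask; demo_isr_t isr; };
static struct demo_iv demo_tbl[DEMO_IV_ENTRIES];

/* Empty placeholder ISRs used only as distinct addresses. */
static void demo_isr_a(void) { }
static void demo_isr_b(void) { }

static int demo_register_isr(unsigned long mask, demo_isr_t isr)
{
    int i;

    /* Reject a mask that shares any bit with a registered one. */
    for (i = 0; i < DEMO_IV_ENTRIES; i++)
        if (demo_tbl[i].mask & mask)
            return DEMO_CONFLICT;

    /* Take the first free entry. */
    for (i = 0; i < DEMO_IV_ENTRIES; i++)
        if (demo_tbl[i].mask == DEMO_IV_AVAIL) {
            demo_tbl[i].mask = mask;
            demo_tbl[i].isr = isr;
            return DEMO_OK;
        }
    return DEMO_TBL_FULL;
}
```

Registering mask 0x3 and then 0x2 gives a conflict, because bit 1 is already claimed, while 0x4 is accepted.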
The 1st level interrupt handler (miscDmaIsr() in sw\src\os\nnos\hardware\src\dmaisr.c)
[QUOTE]/* NOTE: since the specific DMA that triggered this interrupt is using irqs,
there's no need to clear the appropriate bit in the rdma_done register */
void miscDmaIsr(IRQ_REGS *pIrqRegs)
{
register DWORD dwDmaDoneMask;
register DWORD dwDmaDone;
dwDmaDoneMask = GET_KODIAC_REG(misc_dma_donemask);
dwDmaDone = GET_KODIAC_REG(misc_dma_done);
GenericLvl1Isr(pIrqRegs, dwDmaDoneMask & dwDmaDone, pDmaIvTbl);
}[/QUOTE]
[QUOTE]void
GenericLvl1Isr(IRQ_REGS *pIrqRegs, DWORD dwStatus, LEVEL2_IV_ENTRY *pLvl2)
{
LEVEL2_IV_ENTRY *pEnd;
/* find and call the 1st IV in the IV Tbl that matches this interrupt */
for (pEnd = pLvl2 + NUM_LEVEL2_IV_ENTRIES; pLvl2 < pEnd; pLvl2++)
if (dwStatus & pLvl2->dwMask) {
(*(pLvl2->pIsr))(pIrqRegs);
return; /* done */
}
/* An error not to find a match */
(volatile int) dwIsrHangInfo = dwStatus;
ISR_HANG;
}[/QUOTE]
1. Fetch the contents of misc_dma_donemask and misc_dma_done (lines 27-28)
2. Call GenericLvl1Isr(), which searches the 2nd level interrupt handler table for the entry that matches the interrupt status and calls that 2nd level interrupt handler. 2nd level interrupt handlers are usually registered during the initialization phase of device drivers (line 29)
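The two steps can be tried out in isolation with the sketch below, which replaces the hardware registers with plain arguments. The demo_ names are hypothetical, and the sketch returns -1 when nothing matches, whereas GenericLvl1Isr() records the status and hangs.

```c
#include <stddef.h>

#define DEMO_LVL2_ENTRIES 4

struct demo_lvl2 { unsigned long mask; void (*isr)(void); };
static struct demo_lvl2 demo_dma_tbl[DEMO_LVL2_ENTRIES];

/* Counters so the effect of each 2nd level handler is observable. */
static int demo_rx_hits, demo_tx_hits;
static void demo_rx_isr(void) { demo_rx_hits++; }
static void demo_tx_isr(void) { demo_tx_hits++; }

/* Combine "done" with "done mask", then call the first 2nd level
 * handler whose mask matches. Returns 0 if a handler ran, -1 if not. */
static int demo_lvl1_dispatch(unsigned long done, unsigned long done_mask)
{
    unsigned long status = done & done_mask;
    int i;

    for (i = 0; i < DEMO_LVL2_ENTRIES; i++)
        if (demo_dma_tbl[i].isr != NULL && (status & demo_dma_tbl[i].mask)) {
            demo_dma_tbl[i].isr();
            return 0;
        }
    return -1;
}
```

Note that only the first matching entry is called per invocation, which mirrors the early return in GenericLvl1Isr().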
[[i] Last edited by .crystal. on 2007-4-23 17:39 [/i]]
一代宗师 2005-4-30 15:33
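Re: A brief analysis of NNOS (a POSIX-compliant, Linux-like operating system)
Sigh, I can't follow this. How about recommending a good book that analyzes the Linux kernel?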
冰山一角 2005-5-1 18:11
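Re: A brief analysis of NNOS (a POSIX-compliant, Linux-like operating system)
I feel the title should really be "Driver framework analysis"... Still, the fact that the original poster wrote such a strong article without consulting any references is truly admirable.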
ljy545 2006-8-15 14:53
Re: A brief analysis of NNOS (a POSIX-compliant, Linux-like operating system)
hiphoper:
Hello!
I am working on NNOS too. Which company are you with?
How can I get in touch with you?
traceroute 2007-2-23 22:11
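Re: A brief analysis of NNOS (a POSIX-compliant, Linux-like operating system)
As expected, CHECKPOINT. The symbol naming style is also what you see at an established company, though I'm not used to it myself. It doesn't look like it is entirely kernel-space code, either.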