OVS (Open vSwitch) is a widely used software-implemented virtual network switch.
Major cloud vendors generally build their virtual networks on OVS, each modifying it to fit their own requirements; DPUs also use OVS to implement flow-table offload. OVS flow tables form a multi-level structure: the level most directly tied to the user is OpenFlow, and the flows installed downward are called EMC flows.
OVS generally runs in one of two modes: kernel mode and DPDK mode. In kernel mode there is a kernel module called datapath, which maintains a copy of the EMC flows at the kernel level. A packet received from the NIC is first matched against the EMC flows in the datapath; if there is no match, it is handed to the core component ovs-vswitchd through the upcall mechanism. When ovs-vswitchd receives an upcalled packet, it first checks whether a matching EMC flow exists; if one exists, the packet is processed according to that EMC flow; if not, the packet is matched against OpenFlow, an EMC flow is created from the match result and installed into the datapath. This process is also called offload. The main job of ovs-vswitchd is thus to match received packets and to process and forward them according to the match results.
The overall structure of OVS and the structure of OpenFlow will be covered in detail in separate articles. This article analyzes the overall code structure of the core component ovs-vswitchd. Since this component is very complex and involves many data structures, the article will be updated continuously over time; if anything is missing, please leave a comment.
When the code structure is simple, indentation alone is used to show which function belongs to which; when it is complex, level numbers plus indentation are used.
The main function of ovs-vswitchd
main()
->bridge_init()
->lacp_init() // command registration
->bond_init() // command registration
->......
->ovs_numa_init()
->while true // infinite loop
->memory_run() // memory handling
->memory_init()
->bridge_run() // bridge handling
->netdev_run() // network devices
->memory_wait()
->bridge_wait()
->netdev_wait()
bridge_run()
This function does the main packet processing. Its structure is fairly complex, so it is broken down in detail.
bridge_run()
->ovsdb_idl_run() // connect to ovsdb
->if_notifier_run()
->rtnetlink_run()
->nln_run() // learn about interface changes via netlink
->ovsrec_open_vswitch_first() // fetch the ovsdb configuration
->dpdk_init()
->bridge_init_ofproto() // initialize OpenFlow on the bridges
->bridge_run__()
->ofproto_enumerate_types() // ofproto_class currently only contains ofproto_dpif, so what gets enumerated are the types in dpif_classes, currently system/netlink and netdev
->ofproto_type_run() // runs ofproto_dpif->type_run()
->ofproto_run() // executed for every bridge
->bridge_reconfigure() // reconfigure the bridge information
dpdk_init()
dpdk_init()
->dpdk_init__()
->construct_dpdk_args() // DPDK arguments
->rte_eal_init()
->netdev_dpdk_register()
->netdev_register_flow_api_provider() // called by tc, dpdk, and dummy
bridge_init_ofproto()
bridge_init_ofproto()
-> checks initialized; this func only runs once
-> // based on the cfg, first iterate over the bridges, collect the bridge and port structures, and store them in iface_hints
->ofproto_init() // processes iface_hints
->ofproto_class_register(&ofproto_dpif_class) // registers the dpif ofproto_class; it is stored in the ofproto_classes array, n_ofproto_classes tracks the count, and the array grows by itself
->the iface_hints structures are saved as init_ofp_ports
->ofproto_classes[i]->init(init_ofp_ports) // runs the ofproto_class init, here ofproto_dpif_class
->ofproto_unixctl_init() // registers some commands
// +1
init() // ofproto_dpif_class
-> // the passed-in arguments are added to init_ofp_ports
->ofproto_unixctl_init/ofproto_dpif_trace_init/udpif_init() // register some unixctl interfaces
// +2
ofproto_unixctl_init()
->// registers ofproto_unixctl_dpif_dump_flows
->ofproto_dpif_lookup_by_name() // look up by bridge name
->unixctl_command_reply() // writing to conn constitutes the reply
ofproto_enumerate_types() (called from bridge_run())
->ofproto_classes[i]->enumerate_types // taking ofproto_dpif_class as the example
->dp_enumerate_types()
ofproto_type_run()
ofproto_type_run() // ofproto_dpif->ofproto_normalize_type()->ofproto_class_find__()->ofproto_class->type_run()
type_run()
type_run() // ofproto_dpif
->shash_find_data() // find the dpif_backer
->dpif_run() // when it succeeds, need_revalidate must be set
->dpif->dpif_class->run() // dpif_netdev_run
->udpif_run() // upcall dpif; unixctl message handling
-> // under certain conditions, start receiving packets from the datapath
->dpif_recv_set()
->need_revalidate must be set
->udpif_set_threads()
->udpif_start_threads()
->ovs_thread_create(udpif_upcall_handler)
->ovs_thread_create(udpif_revalidator)
->backer->need_revalidate // revalidation needed; detailed below
dpif_netdev_run() // 4
dpif_netdev_run()
->dp_netdev_process_rxq_port() // receive; explained below
->dp_netdev_pmd_flush_output_packets() // send
->reconfigure_datapath() // more on this later
->reconfigure_pmd_threads() // adjust the threads dynamically as ports change
->ovs_thread_create(pmd_thread_main) // thread creation
->pmd_thread_main() // contains an infinite loop that receives and sends
->ovs_numa_thread_setaffinity_core() // set CPU affinity
->dp_netdev_process_rxq_port() // handle receive
->cycle_timer_start() // record the processing time
->dp_packet_batch_init() // initialize the batch structure
->netdev_rxq_recv() // receive
->dp_netdev_input() // process; the pkt metadata here is invalid
->dp_netdev_input__()
->dp_netdev_pmd_flush_output_packets()
->dp_netdev_pmd_flush_output_packets() // send
// 5
dp_netdev_input__() // packet processing
->dfc_processing() // datapath flow cache; per-packet processing
->parse_tcp_flags() // flow director finds the matching flow, then the packet's L2/L3/L4 info and TCP flags are parsed
->dp_netdev_queue_batches() // based on the parsed flow info, put the packet into a batch or into the flow map
->packet_batch_per_flow_init()
->packet_batch_per_flow_update()
->packet_enqueue_to_flow_map()
->miniflow_extract() // extract the miniflow
->emc_lookup()
->netdev_flow_key_equal_mf() // match the flow; essentially a memcmp
->fast_path_processing()
->dp_netdev_pmd_lookup_dpcls()
->dpcls_lookup()
->dp_netdev_pmd_lookup_flow()
->handle_packet_upcall()
->dp_netdev_upcall() // two actions are passed in: one from the normal flow table, the other apparently already parsed
->odp_flow_key_from_flow()
->dp_netdev->upcall_cb() // netdev has one; a func set via registration, explained in detail below
->dp_netdev_execute_actions() // executes the actions; at the end the packet may be placed on a queue, detailed below
->dp_netdev_pmd_lookup_flow()
->dp_netdev_flow_add() // insert the dp_netdev_flow
->dp_netdev_pmd_find_dpcls()
->dpcls_insert()
->dpcls_find_subtable()
->dpcls_create_subtable()
->dp_emc_flow_add() // added in the vendor (移動) build
->queue_netdev_flow_put()
->dp_netdev_queue_batches()
->packet_batch_per_flow_execute()
->dp_netdev_execute_actions() // detailed below
->dp_emc_flow_add()
->queue_netdev_flow_put() // explained below
// 6
dpcls_lookup()
->// reset the rules
->iterate over cls->subtables
->subtable->lookup_func() // fairly complex; needs further study
->// count how many subtable searches were done to match this packet, in order to estimate how scattered across subtables the matching packets are
->// everything matched: return early
// 6
dp_netdev_pmd_lookup_flow()->dp_netdev_pmd_lookup_dpcls()->dpcls_lookup()->dp_netdev_flow_cast()
queue_netdev_flow_put() // 6; used above
->ovs_thread_create(dp_netdev_flow_offload_main)
->dp_netdev_alloc_flow_offload()
->dp_netdev_append_flow_offload() // insert into the dp_flow_offload list
udpif_upcall_handler() // 4
udpif_upcall_handler() // thread
->recv_upcalls()
->dpif_recv()
->dpif->dpif_class->recv() // netdev doesn't have it; netlink does
->upcall_receive() // the usual entry here: a flow miss
->pkt_metadata_from_flow() // backfill the pkt metadata from the struct flow info
->process_upcall() // explained below
->handle_upcalls()
->ukey_install()
->dpif_operate()
->dpif->dpif_class->operate() // dpif_netdev_operate?
// 5
dpif_netdev_operate()
->dpif_netdev_flow_put()
->flow_put_on_pmd()
->dp_netdev_flow_add()
->dpif_netdev_flow_del()
->flow_del_on_pmd()
->dp_netdev_pmd_remove_flow()
->dpif_netdev_execute()
->dp_netdev_execute_actions()
->odp_execute_actions(dp_execute_cb) // some OVS_ACTION_XXX cases call dp_execute_cb
->dp_execute_cb() // handles the individual OVS_ACTION_XXX cases
->dp_netdev_pmd_flush_output_on_port()/dp_packet_batch_add() // for OVS_ACTION_OUTPUT: either send now or place on the send queue
->push_tnl_action() // OVS_ACTION_ATTR_TUNNEL_PUSH
->netdev_push_header() // prepend the header
->conntrack_execute()
->ipf_preprocess_conntrack() // IP fragment preprocessing
->write_ct_md()
->process_one_fast() // NAT is handled here
->handle_nat()
->un_nat_packet() / nat_packet()
->process_one() // see below
->dp_netdev_pmd_flush_output_packets() // details elsewhere
// 6
dp_netdev_flow_offload_main() // processes the requests on the dp_flow_offload list
->dp_netdev_flow_offload_put()
->netdev_flow_put()
->flow_api->flow_put() // netdev_offload_dpdk_flow_put
netdev_offload_dpdk_flow_put() // 7
->netdev_offload_dpdk_validate_flow()
->netdev_offload_dpdk_add_flow()
->netdev_offload_dpdk_actions()
->netdev_offload_dpdk_mark_rss()
->ufid_to_rte_flow_associate()
ipf_preprocess_conntrack() // 6; IP fragment preprocessing
->ipf_extract_frags_from_batch() // pull the fragmented packets out of the batch
->ipf_is_valid_v4_frag()/ipf_is_valid_v6_frag() // check whether it is a valid IP fragment
->ipf_handle_frag() // on success, add to the ipf and delete from the batch; on failure, refill into the batch
->ipf_v6_key_extract()/ipf_v4_key_extract() // extract the ipf key and look up an ipf list by the key
->ipf_process_frag()
->ipf_execute_reass_pkts()
->ipf_dp_packet_batch_add()
process_one() // 6; runs before execution
->initial_conn_lookup()
->if already NATted: conn_key_reverse() // the packet has been NATted; swap the source and destination IP/port
->conn_key_lookup() // look up the conn
->// wrong direction, and forcing is allowed: delete
->conn_lookup()
->conn_clean()
->// the conn exists and its type is un-nat
->conn_key_hash()
->conn_key_lookup()
->write_ct_md()
->conn_update_state()
->// NAT info present and the conn is not newly created
->handle_nat
->// a new connection must be created
->conn_not_found()
->// the packet is not allowed to create a new conn: exit
->// set pkt->md.ct_state to CS_NEW
->// commit required, i.e. actually create the conntrack entry
->// zone limit hit: return directly
->// count limit hit: return directly
->// create a new conn; set the key and rev_key of the initiating conn
->// NAT info present
->// create the nat conn
->// pick an IP and port; if the NATable IPs/ports are exhausted, exit
->nat_packet() // apply NAT to the packet
->// apply NAT to the packet; nat_conn's key is nc's rev_key and vice versa; the addresses in the key are the translated ones
->nat_conn->conn_type = CT_CONN_TYPE_UN_NAT; // this marks that NAT must be reversed
->// nat_conn is also inserted into ct->conns
->nc->nat_conn = nat_conn; // without NAT, nc->nat_conn stays NULL
->nc->conn_type = CT_CONN_TYPE_DEFAULT; // a normal connection
->write_ct_md() // the only place that fills in the packet metadata
udpif_revalidator() // 4
udpif_revalidator() // thread
->revalidate()
->revalidator_sweep()
->revalidator_sweep__()
// 5
revalidator_sweep__()->push_ukey_ops()->push_dp_ops()->ukey_delete()->ukey_delete__()
revalidate() // 5
->dpif_flow_dump_thread_create()
->dpif_emc_flow_dump_next() // added in the vendor (移動) build
->dpif->dpif_class->emc_flow_dump_next() // dpif_netdev_emc_flow_dump_next
->dpif_flow_dump_next()
->dpif->dpif_class->flow_dump_next() // dpif_netdev_flow_dump_next
->ukey_acquire()
->ukey_create_from_dpif_flow() // creating the ukey here is not recommended; see the comment in the code
->ukey_install__()
->revalidate_ukey()
->revalidate_ukey__()
->xlate_ukey()
->xlate_push_stats()
->reval_op_init()
->push_dp_ops()
// 6
dpif_netdev_emc_flow_dump_next->dp_netdev_pmd_get_next->cmap_next_position->dpif_emc_flow_timeout->dpif_dp_flow_del_by_emc->dp_pmd_remove_flow_by_emc
dpif_netdev_flow_dump_next // 6
->dp_netdev_pmd_get_next
->cmap_next_position(pmd->flow_table)
->dp_netdev_flow_to_dpif_flow
->get_dpif_emc_flow_status_by_dp_flow
get_dpif_emc_flow_status_by_dp_flow->get_dpif_emc_flow_status
dp_emc_flow_to_dpif_flow->get_dpif_emc_flow_status
get_dpif_emc_flow_status->dpif_netdev_get_emc_flow_offload_status->netdev_ports_get->netdev_emc_flows_get->flow_api->emc_flows_get // netdev_offload_dpdk_emc_flows_get->dp_emc_flow_set_last_stats_attrs->dp_emc_flow_get_last_stats_attrs
netdev_offload_dpdk_emc_flows_get->netdev_dpdk_rte_flow_query_count->rte_flow_query// 6
push_dp_ops()->dpif_operate()->transition_ukey()->xlate_push_stats()->xlate_push_stats_entry()->xlate_push_stats_entry()->rule_dpif_credit_stats()
backer->need_revalidate() // 4
backer->need_revalidate
->// cache the current backer->tnl_backers into tmp_backers
->for each ofproto type // actually there is only ofproto_dpif, and the backer must match
->for each ofport_dpif
->// skip non-tunnel ports
->netdev_vport_get_dpif_port() // apparently it is a vport only when it is a tunnel?
->// present in tmp_backers: remove it from there and add it to backer->tnl_backers
->// not in tmp_backers
->dpif_port_add() // add to backer->dpif and allocate an odp_port_t
->dpif_class->port_add()
->tnl_port_reconfigure()
->dpif_port_del() // delete the tmp_backers entries that are no longer needed
->xlate_txn_start()
->for each ofproto_dpif
->xlate_ofproto_set()
->for each bundle of the ofproto_dpif
->xlate_bundle_set()
->for each port of the ofproto_dpif
->xlate_ofport_set()
->xlate_txn_commit()
->udpif_revalidate() // bump the sequence number
// 5
xlate_ofproto_set()->xlate_xbridge_init()->xbridge_addr_create()->xlate_xbridge_set()
ofproto_run() // 2
ofproto_run()
->ofproto_class->run()
->bundle_run()
->send_pdu_cb()
->ofproto_dpif_send_packet()
->connmgr_run() // handles OpenFlow additions
->ofconn_run()
->rconn_run()
->vconn_run()
->vconn_recv()
->do_send_packet_ins()
->rconn_recv()
->vconn_recv()
->do_recv()
->vconn_stream_recv()
->ofptype_decode()
->handle_openflow()
ofproto_dpif_send_packet() // 3
->xlate_send_packet()
->ofproto_dpif_execute_actions()
->ofproto_dpif_execute_actions__()
->xlate_actions() // see below
->dpif_execute()
->dpif_operate() // covered elsewhere
handle_openflow() // 3
->ofptype_decode()
->ofpraw_decode() // parse into an enum ofpraw value such as OFPRAW_NXT_FLOW_MOD, OFPRAW_OFPT10_FLOW_MOD, OFPRAW_OFPT11_FLOW_MOD
->ofpraw_pull()
->ofptype_from_ofpraw() // map the enum ofpraw value to an enum ofptype
->raw_info_get()
->raw_infos // struct generated by Python
->handle_table_features_request()
->handle_single_part_openflow()
->handle_packet_out() // one of the switch cases
->ofproto_packet_out_start()
->ofproto->ofproto_class->packet_xlate()
->xlate_actions() // see below
->handle_flow_mod() // one of the switch cases
->ofputil_decode_flow_mod()
->ofpacts_pull_openflow_instructions()
->get_actions_from_instruction()
->ofpacts_decode()
->ofpact_pull_raw()
->ofpact_decode_raw()
->ofpact_decode() // Python-generated code that calls the decode_XXX functions
->handle_flow_mod__()
->ofproto_flow_mod_init() // initialize the rule
->add_flow_init() // OFPFC_ADD
->cls_rule_init_from_minimatch()
->ofproto_rule_create()
->ofproto->ofproto_class->rule_alloc
->rule_actions_create() // simply a memory copy of a struct ofpact block
->ofproto->ofproto_class->rule_construct()
->modify_flows_init_loose() // OFPFC_MODIFY
->modify_flow_init_strict() // OFPFC_MODIFY_STRICT
->delete_flows_init_loose() // OFPFC_DELETE
->rule_criteria_init()
->rule_criteria_require_rw()
->delete_flows_init_strict() // OFPFC_DELETE_STRICT
->ofproto_flow_mod_start() // start modifying the rule
->add_flow_start()
-> // the temp rule was created earlier by add_flow_init
-> // get the actions from the rule
->rule_from_cls_rule() // check whether an old rule exists
-> // no old rule
->choose_rule_to_evict() // over the maximum rule count: evict one rule
-> // an old rule exists
->rule_collection_add(ofm->old_rules) // add to the old-rule list
->rule_collection_add(ofm->new_rules) // add to the new rules; everything above manipulates the ofm structure
->replace_rule_start() // ofproto->tables->cls / refresh the rule's expiry
->ofproto_rule_insert__()
->cookies_insert() // insert into ofproto->cookies
->eviction_group_add_rule() // if the rule may be evicted, add it to the eviction list
->classifier_insert()
->classifier_replace()
->modify_flow_start_strict()
->modify_flows_start__()
->replace_rule_start()
-> // remove the old rule if one exists
->ofproto_rule_insert__() // insert the flow into the base data structures so that later flows can relate to it; this is reversible in case a later error requires rollback
->ofproto_flow_mod_finish() // rule modification complete
->add_flow_finish()
->replace_rule_finish()
->ofproto->ofproto_class->rule_insert() // only does anything when old_rule exists
->handle_flow_stats_request()
->ofputil_decode_flow_stats_request()
->collect_rules_loose()
->ofproto->ofproto_class->rule_get_stats
->handle_bundle_control()
->do_bundle_commit()
bridge_reconfigure() // 2
bridge_reconfigure()
->add_del_bridges()
->bridge_delete_ofprotos()
->for each bridge in all_bridges
->bridge_delete_or_reconfigure_ports()
->for each bridge in all_bridges
->ofproto_create() // reconfigure the ofproto information for each bridge
->for each bridge in all_bridges
->bridge_add_ports()
->datapath_reconfigure()
->for each bridge in all_bridges
->for each port
->port_configure()
// 3
bridge_delete_or_reconfigure_ports()
->for each port in br->ofproto
->netdev_set_config()
->netdev->netdev_class->set_config() // netdev_dpdk_set_config, netdev_dpdk_vdpa_set_config
->add_ofp_port() // collect the ports that need deleting
->for each port to delete
->ofproto_port_del() // mainly deletes on the dp
->iterate over every port
// 4
netdev_dpdk_vdpa_set_config()->netdev_dpdk_set_config()->rte_eth_dev_is_valid_port()->netdev_dpdk_process_devargs()->rte_eth_dev_is_valid_port()->rte_dev_probe()->netdev_dpdk_lookup_by_port_id()// 4
ofproto_port_del()->ofproto->ofproto_class->port_del() // port_del// 5
port_del()->dpif_port_del()->dpif->dpif_class->port_del() // dpif_netdev_port_del// 6
dpif_netdev_port_del()->do_del_port()->reconfigure_datapath()->port_destroy()// 7
reconfigure_datapath() // partly explained earlier
->for each port
->netdev_set_tx_multiq()
->for each port
->check whether reconfiguration is needed (need_reconfigure)
->for each port needing reconfiguration
->port_reconfigure()
->netdev_reconfigure()
->netdev_class->reconfigure() // netdev_dpdk_reconfigure, netdev_dpdk_vdpa_reconfigure
// 8
netdev_dpdk_vdpa_reconfigure()
->netdev_dpdk_reconfigure()
->rte_eth_dev_reset()/rte_eth_dev_stop()
->netdev_dpdk_mempool_configure()
->dpdk_mp_get() // checks for reuse and the like
->dpdk_mp_create()
->set the netdev_dpdk mempool
->dpdk_eth_dev_init()
->dpdk_eth_dev_port_config()
->rte_eth_dev_info_get()
->rte_eth_dev_configure()
->rte_eth_dev_set_mtu()
->rte_eth_dev_get_mtu()
->rte_eth_tx_queue_setup()
->rte_eth_rx_queue_setup()
->rte_eth_dev_start()
->rte_eth_promiscuous_enable()
->rte_eth_allmulticast_enable()
add_del_bridges()->bridge_create() // 3
ofproto_create() // 3; called for each bridge during reconfigure; creates the ofproto and ofports
->ofproto_normalize_type()
->ofproto_class_find__() // find the matching ofproto class
->class->alloc()
->ofproto->ofproto_class->construct() // the ofproto class constructor
ofproto_dpif_class->construct() // 4
->open_dpif_backer()
->shash_add(&all_dpif_backers)
->ofproto_init_tables()
->hmap_insert(&all_ofproto_dpifs_by_name) // store the ofproto_dpif by name
->hmap_insert(&all_ofproto_dpifs_by_uuid) // store the ofproto_dpif by uuid
open_dpif_backer() // 5
->dpif_create_and_open()
->dpif_create()
->do_open()
->dpif_class->open() // dpif_netdev_open
->netdev_open()
->netdev_ports_insert()
->dpif_open()
->do_open()
->udpif_create()
->dpif_register_upcall_cb(upcall_cb)
->dpif_class->register_upcall_cb() // netlink doesn't have it; netdev does
->check_support()
// 6
dpif_netdev_open()->create_dp_netdev()->conntrack_init()// 6
upcall_cb()
->upcall_receive() // look up the ofproto
->classify_upcall() // classify the upcall: userspace-action match, miss, or bad upcall
->xlate_lookup()
->xlate_lookup_ofproto_()
->// the flow arrived via recirculation
->recirc_id_node_find() // look up the recirculation info; if none is found, bail out
->// entry not invalid and not coming from the controller
->xport_lookup_by_uuid() // look up by port uuid
->// the flow did not arrive via recirculation
->xport_lookup(tnl_port_should_receive, tnl_port_receive)
->ofproto_dpif_lookup_by_uuid() // userspace-action upcall
->process_upcall()
->upcall_xlate() // one of the cases; slow path
->// statistics
->xlate_in_init() /* by default the frozen state is empty, but when the flow's recirc_id is nonzero, the recirc_id_node is looked up by id and used to fill the frozen state; odp_actions holds the final action results */
->xlate_actions()
->// record why we went to the slow path
->// save all the bridge/port information using RCU
->xlate_wc_init() // wildcards
->tnl_wc_init()
->// frozen state; everything below restores the frozen state
->// keep the old one and restart the trace
->// thaw
->// a rule already exists; conflict, exit?
->// the ofproto uuid does not match
->// whatever the frozen state recorded takes precedence
->// bridge not found: exit
->// not tracked
->clear_conntrack()
->frozen_metadata_to_flow() // restore the frozen state's metadata into the flow's metadata
->// restore the stack, if any
->// restore the mirror state
->// recirc_id set but not a frozen state: something went wrong
->// get the approximate input port; in the frozen case flow->in_port is the final input port
->// for a non-L2 packet arriving on an L3 port, add a dummy L2 header for the lookup
->// no rule and no actions: look up the rule
->rule_dpif_lookup_from_table()
->rule_dpif_lookup_in_table()
->// count the non-thawed packets
->// not frozen: handle special packets such as LACP, BFD, CFM
->mirror_ingress_packet()
->// drop packets coming from a reserved mirror port
->// all other cases
->// no frozen state
->compose_ipfix_action()
->tnl_process_ecn() // called when action translation starts; tunnel/ECN drop handling
->mirror_ingress_packet() // packet mirroring
->do_xlate_actions()
->freeze_unroll_actions() // when we must bail out, the remaining actions are put into ctx->freeze_actions
->// with tracing enabled, the related trace information is printed
->xlate_output_action() // one of the cases
->xlate_controller_action()/ctx_trigger_freeze() // perform the controller action, or trigger freeze and exit
->xlate_group_action() // one of the cases
->compose_conntrack_action() // one of the cases
->compose_slow_path()
->ukey_create_from_upcall() // done only when the upcall is a miss
->ukey_create__()
->should_install_flow()
->ukey_install()
// 7
xlate_output_action()
->compose_output_action() // one of the cases: output locally, to a given port, or back out the in_port
->compose_output_action__()
->check_output_prerequisites() // checks a series of conditions
->// for Ethernet, fetch the L3 protocol type
->// a non-NULL xport->peer means bridge-to-bridge output
->patch_port_output()
->// the xport is a tunnel
->netdev_vport_inc_tx() // counters
->ovs_native_tunneling_is_on() // native tunnel vs. kernel tunnel
->commit_odp_tunnel_action() // appends additional actions
/* Output from one bridge to another, the bridges being connected by a patch port or tunnel port. The output action to the other bridge triggers the continuation of translation in the next bridge. The process can be recursive: the next bridge may output to yet another one. The actions translated from the second bridge onward are wrapped in a clone action, so that any modifications to the packet remain visible to the remaining actions on the original bridge. */
->xlate_commit_actions() // converts the various set operations here
->commit_odp_actions()
->commit_set_nsh_action()
->commit_nsh()
->commit_set_nw_action()
->commit_set_ipv4_action()
->commit_set_ipv6_action()
->patch_port_output()
->process_special() // process_special handles special protocols
->native_tunnel_output()
->xlate_table_action() // one of the cases: jump to another table
->rule_dpif_lookup_from_table()
->// save the original L4 source and destination addresses
->// search from the given table onward, skipping internal tables
->// on a match, return; on a miss, decide from the system configuration and the passed-in arguments
->// on a miss, the arguments determine whether to keep searching or to send to the controller
->ofp_port_to_ofport() // find the port from the ofproto; if the port does not have no_packet_in set, the rule is ofproto->miss_rule
->xlate_normal() // one of the cases; normal switch (MAC-learning) behavior?
->flood_packets() // one of the cases: flood
->xlate_controller_action() // one of the cases: send to the controller
native_tunnel_output() // 8
->netdev_init_tnl_build_header_params()
->tnl_port_build_header()
->netdev_build_header()
->netdev->netdev_class->build_header() // netdev_vxlan_build_header
->odp_put_tnl_push_action()
xlate_group_action() // 7
compose_conntrack_action() // 7
->xlate_commit_actions()
->do_xlate_actions() // internally calls do_xlate_actions again, parsing the NAT and ct_mark/ct_label information
// 3
bridge_add_ports()->bridge_add_ports__()->iface_lookup()->iface_create()->iface_do_create()->ofproto_port_add()->ofproto->ofproto_class->port_add()
ofproto->ofproto_class->port_add() // 4
->dpif_port_add()
->dpif->dpif_class->port_add() // dpif_netdev_port_add
->netdev_ports_insert()
dpif_netdev_port_add() // 5
->netdev_vport_get_dpif_port()
->do_add_port()
->port_create()
->netdev_open()
->reconfigure_datapath() // explained earlier
// 3
port_configure() // creates the ofbundle
->ofproto_bundle_register()
->bundle_set()
netdev_run()
netdev_run()
->netdev_initialize()
->netdev_vport_tunnel_register()
->netdev_register_provider() // netdev_classes
->netdev_class->run() // the dpdk netdev_class doesn't have one
The dpdk_class netdev class
static const struct netdev_class dpdk_class = {
    .type = "dpdk",
    .init = netdev_dpdk_class_init,
    .construct = netdev_dpdk_construct,
    .destruct = netdev_dpdk_destruct,
    .set_config = netdev_dpdk_set_config,
    .set_tx_multiq = netdev_dpdk_set_tx_multiq,
    .get_carrier = netdev_dpdk_get_carrier,
    .get_stats = netdev_dpdk_get_stats,
    .get_custom_stats = netdev_dpdk_get_custom_stats,
    .get_features = netdev_dpdk_get_features,
    .get_status = netdev_dpdk_get_status,
    .reconfigure = netdev_dpdk_reconfigure,
    .rxq_recv = netdev_dpdk_rxq_recv,
    .send = netdev_dpdk_send,
    NETDEV_DPDK_CLASS_COMMON,
};
netdev_dpdk_eth_send()
->netdev_dpdk_send__()
->netdev_dpdk_eth_tx_burst()
->rte_eth_tx_burst()
netdev_dpdk_rxq_recv()
->rte_eth_rx_burst()
->dp_packet_batch_init_packet_fields()